Inconsistent syllable error rate between vak eval and predict

zhileizhao · September 14, 2023, 11:59pm

Hi vocalpy team,

Great work on developing the toolkits! Thank you so much for making them publicly accessible!
I’ve been using vak to train TweetyNet model on my own vocalization data for the annotation task. I’m a little confused about the results. Seems that the syllable accuracy is not correctly calculated by the vak eval function. Here is one example output from vak eval:
2023-09-14 19:37:19,402 - vak.core.eval - INFO - avg_acc: 0.85201
2023-09-14 19:37:19,402 - vak.core.eval - INFO - avg_acc_tfm: 0.85201
2023-09-14 19:37:19,402 - vak.core.eval - INFO - avg_levenshtein: 45.52941
2023-09-14 19:37:19,402 - vak.core.eval - INFO - avg_levenshtein_tfm: 45.52941
2023-09-14 19:37:19,402 - vak.core.eval - INFO - avg_segment_error_rate: 0.78289
2023-09-14 19:37:19,402 - vak.core.eval - INFO - avg_segment_error_rate_tfm: 0.78289
2023-09-14 19:37:19,402 - vak.core.eval - INFO - avg_loss: 0.52653

If I understand the results correctly, my model is able to achieve 85% frame accuracy, but the syllable accuracy is pretty bad. However, if I use vak predict to generate predicted labels, the results weren’t that bad. I compared the predicted labels to the ground-truth labels and calculated the levenshtein distance myself using the metrics.Levenshtein() function, the average syllable error is only 26.8%, instead of 78.2% as claimed by the vak eval function.
I’ve been thinking about this for a while, but couldn’t figure it out why the syllable error rate is so different between vak predict and eval. Any thoughts?
I’m using the ‘simple-seq’ format and only have single-char labels.

Thank you!
Zhilei

nicholdav · September 15, 2023, 1:07pm

Hi @zhileizhao, welcome to the forum, glad to hear you appreciate what we’re doing

Any thoughts?

Yeah, I think you have found a bug

The metric values with and without the transforms applied should definitely not be the same!

And you have done an excellent job of confirming it by computing the metric yourself, thank you!

Can you please go ahead and raise an issue on our tracker on GitHub reporting this bug?

We’ll be happy to give you credit for spotting it. I have a feeling the fix might be pretty simple. If you are feeling up to it I can walk you through contributing a fix to

zhileizhao · September 16, 2023, 7:00pm

Hi @nicholdav thank you so much for the quick response!
Yes, I will raise an issue on the GitHub repo. Glad that I contribute a little bit to this wonderful package!

nicholdav · September 18, 2023, 8:08pm

Excellent, thank you @zhileizhao! I saw you raised the issue, appreciate it. I have it on my to-do list to reply

nicholdav · September 20, 2023, 10:11pm

Thanks again for raising the issue @zhileizhao.
I’m just linking it here for future reference: Inconsistent syllable error rate between vak eval and predict · Issue #697 · vocalpy/vak · GitHub
Long story short, this was fixed in version 1.0 but I didn’t make note that we needed to fix it in version < 1.0 – my fault for letting that slip through the cracks.
A quick fix might be to upgrade to version 1.0 but we should fix the bug in the maintenance version too.

Let’s continue the discussion on GitHub – just let me know how I can help you.
We’d definitely be happy to help you contribute a fix if you’d like, or I can take care of it.
Appreciate your catching this bug!

Topic		Replies	Views
How to optimize training and evaluate `vak eval` results Q&A	1	177	March 6, 2024
Vak 0.8.0 + TweetyNet 0.9.0 released; vak 1.0 in development Announcements vak	0	276	February 16, 2023
Vak version 0.7.0 released! Announcements	0	241	December 5, 2022
Released: vak 0.6.0 and tweetynet 0.8.0 🐦 Announcements vak , tweetynet	0	273	July 9, 2022
Error in last step of generating predictions Q&A	2	235	February 25, 2023

Inconsistent syllable error rate between vak eval and predict

Related topics