Hi @YMK123, sorry that vak is not giving you clear enough feedback here.
I’m posting the full text of the error you got for context:
determined that purpose of config file is: train
will add 'csv_path' option to 'TRAIN' section
purpose for dataset: train
will split dataset
making array files containing spectrograms from audio files in: /Users/Lab/VAK_test/test
/Users/Lab/VAK_test/test/01-F81AD-191209_1807_halfhr1_callsegs_st_53989425-54099675
01-F81AD-191209_1807_halfhr1_callsegs_st_53989425-54099675
Traceback (most recent call last):
  File "/Users/Lab/anaconda3/envs/vak-env/bin/vak", line 10, in <module>
    sys.exit(main())
  File "/Users/Lab/anaconda3/envs/vak-env/lib/python3.9/site-packages/vak/__main__.py", line 45, in main
    cli.cli(command=args.command, config_file=args.configfile)
  File "/Users/Lab/anaconda3/envs/vak-env/lib/python3.9/site-packages/vak/cli/cli.py", line 30, in cli
    COMMAND_FUNCTION_MAP[command](toml_path=config_file)
  File "/Users/Lab/anaconda3/envs/vak-env/lib/python3.9/site-packages/vak/cli/prep.py", line 132, in prep
    vak_df, csv_path = core.prep(
  File "/Users/Lab/anaconda3/envs/vak-env/lib/python3.9/site-packages/vak/core/prep.py", line 205, in prep
    vak_df = dataframe.from_files(
  File "/Users/Lab/anaconda3/envs/vak-env/lib/python3.9/site-packages/vak/io/dataframe.py", line 137, in from_files
    spect_files = audio.to_spect(
  File "/Users/Lab/anaconda3/envs/vak-env/lib/python3.9/site-packages/vak/io/audio.py", line 174, in to_spect
    audio_annot_map = source_annot_map(audio_files, annot_list)
  File "/Users/Lab/anaconda3/envs/vak-env/lib/python3.9/site-packages/vak/annotation.py", line 210, in source_annot_map
    keys = [recursive_stem(annot.audio_path) for annot in annot_list]
  File "/Users/Lab/anaconda3/envs/vak-env/lib/python3.9/site-packages/vak/annotation.py", line 210, in <listcomp>
    keys = [recursive_stem(annot.audio_path) for annot in annot_list]
  File "/Users/Lab/anaconda3/envs/vak-env/lib/python3.9/site-packages/vak/annotation.py", line 175, in recursive_stem
    raise ValueError(f"unable to compute stem of {path}")
ValueError: unable to compute stem of /Users/Lab/VAK_test/test/01-F81AD-191209_1807_halfhr1_callsegs_st_53989425-54099675
My guess is that vak expects a 1:1 match between audio and annotation files, where the name of each annotation file is the full audio file name, extension included, followed by the annotation extension, e.g. ${audio-file-name.ext}.csv.
So you want to have something like:
01-F81AD-191209_1807_halfhr1_callsegs_st_53989425-54099675/
    001.wav
    001.wav.csv
    002.wav
    002.wav.csv
    003.wav
    003.wav.csv
Is that true in your case?
Can you list the names of some of your audio and annotation files here?
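In case it helps with checking, here is a rough sketch of a script that lists audio files with no matching annotation file. The function name find_unpaired and its defaults (.wav audio, .csv annotations) are my own assumptions for illustration; this is not part of vak.

```python
from pathlib import Path


def find_unpaired(data_dir, audio_ext=".wav", annot_ext=".csv"):
    """Return names of audio files with no '<audio file name><annot_ext>' next to them.

    Hypothetical helper, not part of vak. It only checks the naming
    convention described above (e.g. 001.wav paired with 001.wav.csv).
    """
    data_dir = Path(data_dir)
    missing = []
    for audio_path in sorted(data_dir.glob(f"*{audio_ext}")):
        # Annotation name is the full audio file name plus the annotation extension.
        annot_path = audio_path.with_name(audio_path.name + annot_ext)
        if not annot_path.exists():
            missing.append(audio_path.name)
    return missing
```

Calling find_unpaired on your data directory should give an empty list if every audio file has a partner that follows the convention.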
It just dawned on me that the examples in the “how to” here do not actually show the files this way. We show the listing below instead; all those .csv files should actually be named .wav.csv:
BB_SGP16-1___20160521_214723.csv
BB_SGP16-1___20160521_214723.txt
BB_SGP16-1___20160521_214723.wav
BBY15-4___20150907_211645.csv
BBY15-4___20150907_211645.txt
BBY15-4___20150907_211645.wav
... # more files here
DB_1-WWS16-2___20160822_203501.csv
DB_1-WWS16-2___20160822_203501.txt
DB_1-WWS16-2___20160822_203501.wav
Could you please raise an issue on the vak repo pointing out that we need to fix that in the docs?
I’m sorry that how-to isn’t clearer. I’ll be sure to credit you for bringing it to our attention.
The data in the “autoannotate” tutorial does follow this naming convention though.
There you have bird1-timestamp.cbin as an audio file and bird1-timestamp.cbin.not.mat as an annotation file. The recursive_stem function that gives you the cryptic error basically removes extensions until it finds a valid audio file format (currently wav or cbin).
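To make that behavior concrete, here is a minimal sketch of the idea. It is not vak's actual source: the name recursive_stem_sketch and the VALID_AUDIO_FORMATS tuple are assumptions based only on the description above.

```python
from pathlib import Path

# Assumption: only the two formats mentioned above; vak's real list may differ.
VALID_AUDIO_FORMATS = ("wav", "cbin")


def recursive_stem_sketch(path):
    """Strip extensions from the file name until an audio extension is found.

    Returns the part of the name before the audio extension, and raises
    ValueError if no audio extension is ever found.
    """
    name = Path(path).name
    while True:
        stem, dot, ext = name.rpartition(".")
        if not dot:
            # No extensions left to remove and no audio format was seen.
            raise ValueError(f"unable to compute stem of {path}")
        if ext.lower() in VALID_AUDIO_FORMATS:
            return stem
        name = stem  # drop the last extension and try again


print(recursive_stem_sketch("bird1-timestamp.cbin.not.mat"))  # bird1-timestamp
print(recursive_stem_sketch("001.wav.csv"))                   # 001
```

With this logic, a name that contains no audio extension at all, such as BB_SGP16-1___20160521_214723.csv or a bare directory name, ends in the ValueError, which is consistent with the traceback above.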
Note that this convention does not have to apply if you’re using an annotation format where there’s only a single annotation file for multiple audio files. In that case vak (using crowsetta) just gets the source “annotated” files (audio in this case) out of the single annotation file.
I think your fix might be working because it creates the correct situation, but a better fix is to use the naming convention that vak expects, as above, so that you don’t have to modify the internals of crowsetta, which might come back to haunt you later if you forget you did it.
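If renaming by hand is tedious, something like the sketch below could do it. The helper rename_annotations and its defaults are hypothetical (my own naming, not part of vak or crowsetta), and it assumes each annotation file currently sits next to its audio file as name.csv. It defaults to a dry run so you can inspect the planned renames first.

```python
from pathlib import Path


def rename_annotations(data_dir, audio_ext=".wav", annot_ext=".csv", dry_run=True):
    """Rename '<name><annot_ext>' to '<name><audio_ext><annot_ext>' for each audio file.

    Hypothetical helper. Returns a list of (old name, new name) pairs.
    Files are only renamed on disk when dry_run is False.
    """
    data_dir = Path(data_dir)
    planned = []
    for audio_path in sorted(data_dir.glob(f"*{audio_ext}")):
        old = audio_path.with_suffix(annot_ext)                  # e.g. 001.csv
        new = audio_path.with_name(audio_path.name + annot_ext)  # e.g. 001.wav.csv
        if old.exists() and not new.exists():
            planned.append((old.name, new.name))
            if not dry_run:
                old.rename(new)
    return planned
```

I would run it once with dry_run=True, check the returned pairs, and only then run it again with dry_run=False, ideally on a copy of the data.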