Crowsetta accepted as a pyOpenSci package, published in JOSS, version 5.0.0 released

nicholdav · April 13, 2023, 1:42pm

Hi all,

Proud to announce that Crowsetta has been accepted as a package at pyOpenSci and published in JOSS.

The pyOpenSci review is here if you’re interested: “crowsetta: A Python tool to work with any format for annotating animal vocalizations and bioacoustics data” (Issue #68, pyOpenSci/software-submission on GitHub)

The JOSS article is here: https://joss.theoj.org/papers/10.21105/joss.05338

Version 5.0.0 has been released. It is available on PyPI and conda-forge and includes all changes from the review.

Release notes:

5.0.0 – 2023-03-29

This release is the approved version after a successful pyOpenSci review!

github.com/pyOpenSci/software-submission

crowsetta: A Python tool to work with any format for annotating animal vocalizations and bioacoustics data.

opened 07:39PM - 03 Jan 23 UTC

NickleDave

6/pyOS-approved 🚀🚀🚀 7/under-joss-review

Submitting Author: David Nicholson (@NickleDave)
All current maintainers: (@Ni…ckleDave )
Package Name: crowsetta
One-Line Description of Package: A Python tool to work with any format for annotating animal vocalizations and bioacoustics data.
Repository Link: https://github.com/vocalpy/crowsetta
Version submitted:
Editor: @cmarmo
Reviewer 1: @rhine3
Reviewer 2: @shaupert
Archive: https://zenodo.org/record/7781587
Version accepted: v [5.0](https://github.com/vocalpy/crowsetta/releases/tag/5.0.0)
Date accepted (month/day/year): 03/28/2023

---

## Description

- Include a brief paragraph describing what your package does:

crowsetta provides a Pythonic way to work with annotation formats for animal vocalizations and bioacoustics data. It has built-in support for many widely used [formats](https://crowsetta.readthedocs.io/en/latest/formats/index.html#formats-index) such as [Audacity label tracks](https://crowsetta.readthedocs.io/en/latest/formats/seq/aud-txt.html#aud-txt), [Praat .TextGrid files](https://crowsetta.readthedocs.io/en/latest/formats/seq/textgrid.html#textgrid), and [Raven .txt files](https://crowsetta.readthedocs.io/en/latest/formats/bbox/raven.html#raven). The package focuses on providing interoperability, as well as making it easier to share data in plaintext flat-file formats (csv) and common serialization formats (json). In addition, abstractions in the package are designed to make it easy to use these simplified formats for common downstream tasks. Examples of such tasks are fitting statistical models of vocal behavior, and building datasets to train machine learning models that predict new annotations.

## Scope

- Please indicate which [category or categories][PackageCategories] this package falls under:
  - [ ] Data retrieval
  - [x] Data extraction
  - [x] Data munging
  - [ ] Data deposition
  - [x] Reproducibility
  - [ ] Geospatial
  - [ ] Education
  - [ ] Data visualization*

> *Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see [notes on categories][NotesOnCategories] of our guidebook.*

n/a

- **For all submissions**, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):

  - Who is the target audience and what are scientific applications of this package?

    Anyone that works with animal vocalizations or other bioacoustics data that is annotated in some way. Examples (from the landing page of the docs): neuroscientists studying how songbirds learn their song, or why mice emit ultrasonic calls. Ecologists studying dialects of finches distributed across Asia, linguists studying accents in the Caribbean, a speech pathologist looking for phonetic changes that indicate early onset Alzheimer’s disease.

  - Are there other Python packages that accomplish the same thing? If so, how does yours differ?

    Not to my knowledge. There are many format-specific packages in various states of maintenance, e.g. a search for the format `textgrid` used by the application Praat on PyPI currently returns 33 packages (including crowsetta): https://pypi.org/search/?q=textgrid&o= There are also several larger packages whose functionality includes the ability to parse specific formats, e.g. [Parselmouth](https://github.com/YannickJadoul/Parselmouth) wraps all of Praat and thus can load TextGrid files. But the goal of crowsetta is mainly to provide interoperability, and to do so for a wide array of formats, so that other higher-level libraries can leverage its functionality. This emphasis on data extraction + munging, like the possibly destructive transformation to other formats, makes it in scope for pyOpenSci.

## Technical checks

For details about the pyOpenSci packaging requirements, see our [packaging guide][PackagingGuide]. Confirm each of the following by checking the box. This package:

- [x] does not violate the Terms of Service of any service it interacts with.
- [x] has an [OSI approved license][OsiApprovedLicense].
- [x] contains a README with instructions for installing the development version.
- [x] includes documentation with examples for all functions.
- [x] contains a vignette with examples of its essential functions and uses.
- [x] has a test suite.
- [x] has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.

## Publication options

- [x] Do you wish to automatically submit to the [Journal of Open Source Software][JournalOfOpenSourceSoftware]? yes
- [ ] If so:

<details>
<summary>JOSS Checks</summary>

- [x] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [x] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:

*Note: Do not submit your package separately to JOSS*

</details>

## Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

- [x] Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

## Code of conduct

- [x] I agree to abide by [pyOpenSci's Code of Conduct][PyOpenSciCodeOfConduct] during the review process and in maintaining my package should it be accepted.

## Please fill out our survey

- [x] [Last but not least please fill out our pre-review survey](https://forms.gle/F9mou7S3jhe8DMJ16). This helps us track submission and improve our peer review process. We will also ask our reviewers and editors to fill this out.

**P.S.** *Have feedback/comments about our review process? Leave a comment [here][Comments]*

## Potential reviewers

As discussed with @lwasser I am tagging some potential reviewers given that pyOpenSci does not currently have anyone that's familiar with this area (besides me):

@rhine3 @shyamblast @YannickJadoul @avakiai @danibene @nilor

edit: just updating here for clarity that I am only *suggesting* reviewers to help bootstrap the process since we're still a growing org. @lwasser will assign an editor that will then reach out directly to potential reviewers. Sorry for any confusion, and we appreciate interest of people that have replied so far.

## Editor and Review Templates

[Editor and review templates can be found here][Templates]

[PackagingGuide]: https://www.pyopensci.org/contributing-guide/authoring/index.html#packaging-guide
[PackageCategories]: https://www.pyopensci.org/contributing-guide/open-source-software-peer-review/aims-and-scope.html?highlight=data#package-categories
[NotesOnCategories]: https://www.pyopensci.org/contributing-guide/open-source-software-peer-review/aims-and-scope.html?highlight=data#notes-on-categories
[JournalOfOpenSourceSoftware]: http://joss.theoj.org/
[JossSubmissionRequirements]: https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements
[JossPaperRequirements]: https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain
[PyOpenSciCodeOfConduct]: https://www.pyopensci.org/contributing-guide/open-source-software-peer-review/code-of-conduct.html?highlight=code%20conduct
[OsiApprovedLicense]: https://opensource.org/licenses
[Templates]: https://www.pyopensci.org/contributing-guide/appendices/templates.html
[Comments]: https://github.com/pyOpenSci/governance/issues/8

Added

Add information on contributing and setting up a development environment #212. Fixes #30.
Add method to convert generic sequence format to a pandas DataFrame #216.
Add additional vignettes to docs: on removing “silent” labels from TextGrid annotations, on converting to the simple sequence and generic sequence formats #216. Fixes #152 and #197.
Add format class for Audacity extended label track format #226. Fixes #222 and #213.
Add the ability for a crowsetta.Annotation to have multiple sequences #243. Fixes #42.
Rewrite TextGrid class to better handle file formats: parse both “short” and default format in either UTF-8 or UTF-16 encoding; remove empty intervals from interval tiers by default; can convert multiple interval tiers to a single crowsetta.Annotation with multiple crowsetta.Sequences #243. Fixes #241.
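
To make the TextGrid change a bit more concrete, here is a rough standard-library sketch of the idea: drop intervals with an empty label by default, and keep one sequence per interval tier. This is not crowsetta's code or API. The names `Interval`, `drop_empty`, `tiers_to_sequences`, and the `keep_empty` flag are hypothetical, and treating "empty" as a label equal to the empty string is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One labeled stretch of time in an interval tier (hypothetical type)."""
    onset_s: float
    offset_s: float
    label: str


def drop_empty(intervals, keep_empty=False):
    """Return the intervals with a non-empty label, unless keep_empty is True."""
    if keep_empty:
        return list(intervals)
    # Assumption: an "empty" interval is one whose label is the empty string.
    return [iv for iv in intervals if iv.label != ""]


def tiers_to_sequences(tiers, keep_empty=False):
    """Map each tier name to its own filtered sequence of intervals."""
    return {name: drop_empty(ivs, keep_empty) for name, ivs in tiers.items()}


# Two interval tiers, as they might come out of a parsed TextGrid file.
tiers = {
    "syllables": [
        Interval(0.0, 0.5, ""),
        Interval(0.5, 0.9, "a"),
        Interval(0.9, 1.2, ""),
        Interval(1.2, 1.6, "b"),
    ],
    "calls": [Interval(0.0, 2.0, ""), Interval(2.0, 2.4, "tet")],
}

# One sequence per tier, with the unlabeled gaps removed by default.
sequences = tiers_to_sequences(tiers)
```

Passing `keep_empty=True` in this sketch returns every interval unchanged, which mirrors the "by default" wording in the release note.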

Removed

Remove the Segment.from_row method, which is no longer used #232. Fixes #231.

Fixed

Revise landing page of docs, and some vignettes. Make other changes to clean up the docs build process #216.
Coerce path-like attributes of GenericSeq dataframe schema to be strings. This helps ensure these columns are always native Pandas types #237.
Fix how the crowsetta.Segment class converts onset sample and offset sample to int; correctly handle multiple numpy integer subtypes #238.
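
For anyone curious what the two coercion fixes amount to, here is a minimal standard-library sketch of the general technique, not the code that is in crowsetta. The helper names `as_sample_index` and `as_path_str` are hypothetical, and `IntLike` is only a stand-in for a numpy integer scalar so that the example runs without numpy installed.

```python
import operator
import os
from pathlib import PurePosixPath


def as_sample_index(value):
    """Convert an integer-like value to a built-in int.

    operator.index() accepts any object that implements __index__, which
    numpy integer scalars of every width do, and raises TypeError for
    floats and strings instead of silently truncating them.
    """
    return operator.index(value)


def as_path_str(path):
    """Convert a str or pathlib path to a plain str via os.fspath()."""
    return os.fspath(path)


class IntLike:
    """Stand-in for a numpy integer scalar (hypothetical, for illustration)."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


onset_sample = as_sample_index(IntLike(16000))
audio_path = as_path_str(PurePosixPath("data/bird1.wav"))
```

The design choice in this sketch is to go through `__index__` rather than checking for one specific integer type, so any integer subtype is handled the same way.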

Huge thanks to pyOpenSci reviewers Tessa Rhinehart and Sylvain Haupert, to YannickJadoul for expert opinions on all things Praat TextGrid, and to Chiara Marmo for being the editor who brought it all together. I really appreciate all your contributions and the time you all put into this.
