Hi all,
Proud to announce that Crowsetta has been accepted as a package by pyOpenSci and published in JOSS.
The pyOpenSci review is here if you’re interested: "crowsetta: A Python tool to work with any format for annotating animal vocalizations and bioacoustics data." (Issue #68, pyOpenSci/software-submission on GitHub)
The JOSS article is here: https://joss.theoj.org/papers/10.21105/joss.05338
Version 5.0.0 is released and available on PyPI and conda-forge. It includes all changes from the review.
Release notes:
5.0.0 – 2023-03-29
This release is the approved version after a successful pyOpenSci review!
opened 07:39PM - 03 Jan 23 UTC
Submitting Author: David Nicholson (@NickleDave )
All current maintainers: (@Ni… ckleDave )
Package Name: crowsetta
One-Line Description of Package: A Python tool to work with any format for annotating animal vocalizations and bioacoustics data.
Repository Link: https://github.com/vocalpy/crowsetta
Version submitted:
Editor: @cmarmo
Reviewer 1: @rhine3
Reviewer 2: @shaupert
Archive: https://zenodo.org/record/7781587
Version accepted: v [5.0](https://github.com/vocalpy/crowsetta/releases/tag/5.0.0)
Date accepted (month/day/year): 03/28/2023
---
## Description
- Include a brief paragraph describing what your package does:
crowsetta provides a Pythonic way to work with annotation formats for animal vocalizations and bioacoustics data. It has built-in support for many widely used [formats](https://crowsetta.readthedocs.io/en/latest/formats/index.html#formats-index) such as [Audacity label tracks](https://crowsetta.readthedocs.io/en/latest/formats/seq/aud-txt.html#aud-txt), [Praat .TextGrid files](https://crowsetta.readthedocs.io/en/latest/formats/seq/textgrid.html#textgrid), and [Raven .txt files](https://crowsetta.readthedocs.io/en/latest/formats/bbox/raven.html#raven). The package focuses on providing interoperability, as well as making it easier to share data in plaintext flat-file formats (csv) and common serialization formats (json). In addition, abstractions in the package are designed to make it easy to use these simplified formats for common downstream tasks. Examples of such tasks are fitting statistical models of vocal behavior, and building datasets to train machine learning models that predict new annotations.
## Scope
- Please indicate which [category or categories][PackageCategories] this package falls under:
- [ ] Data retrieval
- [x] Data extraction
- [x] Data munging
- [ ] Data deposition
- [x] Reproducibility
- [ ] Geospatial
- [ ] Education
- [ ] Data visualization*
> *Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see [notes on categories][NotesOnCategories] of our guidebook.*
n/a
- **For all submissions**, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
- Who is the target audience and what are scientific applications of this package?
Anyone who works with animal vocalizations or other bioacoustics data that is annotated in some way. Examples (from the landing page of the docs): neuroscientists studying how songbirds learn their song or why mice emit ultrasonic calls, ecologists studying dialects of finches distributed across Asia, linguists studying accents in the Caribbean, and speech pathologists looking for phonetic changes that indicate early-onset Alzheimer’s disease.
- Are there other Python packages that accomplish the same thing? If so, how does yours differ?
Not to my knowledge.
There are many format-specific packages in various states of maintenance, e.g. a search for the format `textgrid` used by the application Praat on PyPI currently returns 33 packages (including crowsetta):
https://pypi.org/search/?q=textgrid&o=
There are also several larger packages whose functionality includes the ability to parse specific formats, e.g. [Parselmouth](https://github.com/YannickJadoul/Parselmouth) wraps all of Praat and thus can load TextGrid files. But the goal of crowsetta is mainly to provide interoperability, and to do so for a wide array of formats, so that other higher-level libraries can leverage its functionality. This emphasis on data extraction + munging, like the possibly destructive transformation to other formats, makes it in scope for pyOpenSci.
## Technical checks
For details about the pyOpenSci packaging requirements, see our [packaging guide][PackagingGuide]. Confirm each of the following by checking the box. This package:
- [x] does not violate the Terms of Service of any service it interacts with.
- [x] has an [OSI approved license][OsiApprovedLicense].
- [x] contains a README with instructions for installing the development version.
- [x] includes documentation with examples for all functions.
- [x] contains a vignette with examples of its essential functions and uses.
- [x] has a test suite.
- [x] has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
## Publication options
- [x] Do you wish to automatically submit to the [Journal of Open Source Software][JournalOfOpenSourceSoftware]?
yes
- [ ] If so:
<details>
<summary>JOSS Checks</summary>
- [x] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [x] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:
*Note: Do not submit your package separately to JOSS*
</details>
## Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
- [x] Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.
## Code of conduct
- [x] I agree to abide by [pyOpenSci's Code of Conduct][PyOpenSciCodeOfConduct] during the review process and in maintaining my package should it be accepted.
## Please fill out our survey
- [x] [Last but not least please fill out our pre-review survey](https://forms.gle/F9mou7S3jhe8DMJ16). This helps us track
submission and improve our peer review process. We will also ask our reviewers
and editors to fill this out.
**P.S.** *Have feedback/comments about our review process? Leave a comment [here][Comments]*
## Potential reviewers
As discussed with @lwasser I am tagging some potential reviewers given that pyOpenSci does not currently have anyone that's familiar with this area (besides me)
@rhine3 @shyamblast @YannickJadoul @avakiai @danibene @nilor
edit: just updating here for clarity that I am only *suggesting* reviewers to help bootstrap the process since we're still a growing org. @lwasser will assign an editor that will then reach out directly to potential reviewers. Sorry for any confusion, and we appreciate interest of people that have replied so far.
## Editor and Review Templates
[Editor and review templates can be found here][Templates]
[PackagingGuide]: https://www.pyopensci.org/contributing-guide/authoring/index.html#packaging-guide
[PackageCategories]: https://www.pyopensci.org/contributing-guide/open-source-software-peer-review/aims-and-scope.html?highlight=data#package-categories
[NotesOnCategories]: https://www.pyopensci.org/contributing-guide/open-source-software-peer-review/aims-and-scope.html?highlight=data#notes-on-categories
[JournalOfOpenSourceSoftware]: http://joss.theoj.org/
[JossSubmissionRequirements]: https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements
[JossPaperRequirements]: https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain
[PyOpenSciCodeOfConduct]: https://www.pyopensci.org/contributing-guide/open-source-software-peer-review/code-of-conduct.html?highlight=code%20conduct
[OsiApprovedLicense]: https://opensource.org/licenses
[Templates]: https://www.pyopensci.org/contributing-guide/appendices/templates.html
[Comments]: https://github.com/pyOpenSci/governance/issues/8
Added
Add information on contributing and setting up a development environment #212. Fixes #30.
Add method to convert generic sequence format to a pandas DataFrame #216.
Add additional vignettes to docs: on removing “silent” labels from TextGrid annotations, on converting to the simple sequence and generic sequence formats #216. Fixes #152 and #197.
Add format class for Audacity extended label track format #226. Fixes #222 and #213.
Add the ability for a crowsetta.Annotation to have multiple sequences #243. Fixes #42.
Rewrite TextGrid class to better handle file formats: parse both “short” and default format in either UTF-8 or UTF-16 encoding; remove empty intervals from interval tiers by default; can convert multiple interval tiers to a single crowsetta.Annotation with multiple crowsetta.Sequences #243. Fixes #241.
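To make the "remove empty intervals by default" behavior a bit more concrete, here is a minimal sketch in plain Python. It does not use crowsetta's actual TextGrid class: the `Interval` dataclass, the `remove_empty` function, and its `keep_empty` parameter are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical stand-in for one interval in a TextGrid interval tier."""
    xmin: float  # onset time, in seconds
    xmax: float  # offset time, in seconds
    text: str    # label; an unlabeled ("silent") interval has the empty string


def remove_empty(intervals, keep_empty=False):
    """Return intervals as a list, dropping those with an empty label
    unless keep_empty is True."""
    if keep_empty:
        return list(intervals)
    return [interval for interval in intervals if interval.text != ""]


# A toy tier where labeled intervals alternate with unlabeled ones.
tier = [
    Interval(0.0, 0.5, ""),
    Interval(0.5, 0.9, "a"),
    Interval(0.9, 1.4, ""),
    Interval(1.4, 1.8, "b"),
]
labeled = remove_empty(tier)
print([interval.text for interval in labeled])
```

Dropping the unlabeled intervals leaves only the segments that carry a label, which is usually what downstream analyses want. The opt-out flag in the sketch reflects that this is a default, not the only option.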
Removed
Remove Segment.from_row method, no longer used #232. Fixes #231.
Fixed
Revise landing page of docs, and some vignettes. Make other changes to clean up the docs build process #216.
Coerce path-like attributes of GenericSeq dataframe schema to be strings. This helps ensure these columns are always native Pandas types #237.
Fix how the crowsetta.Segment class converts onset sample and offset sample to int; correctly handle multiple numpy integer subtypes #238.
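For anyone curious what this kind of coercion can look like, here is a rough sketch using only the standard library. It is not the code from crowsetta: `to_native_int`, `to_native_str`, and `FakeInt16` are hypothetical names, and `FakeInt16` only stands in for a NumPy integer scalar so that the example runs without NumPy installed.

```python
import operator
import os
from pathlib import PurePosixPath


def to_native_int(value):
    """Coerce any integer-like object to a built-in int.

    operator.index accepts anything that implements __index__, which
    includes built-in ints and NumPy integer scalars, and it raises
    TypeError for floats, so 1.5 is never silently truncated.
    """
    return operator.index(value)


def to_native_str(path):
    """Coerce a path-like object (or a plain str or bytes) to a built-in str."""
    return os.fsdecode(path)


class FakeInt16:
    """Hypothetical stand-in for a NumPy integer scalar."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


onset_sample = to_native_int(FakeInt16(441))
audio_path = to_native_str(PurePosixPath("data/bird1.wav"))
print(type(onset_sample).__name__, onset_sample)
print(type(audio_path).__name__, audio_path)
```

Relying on `__index__` rather than checking for one specific type is one way to accept every integer subtype uniformly while still rejecting non-integer values.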
Huge thank yous to pyOpenSci reviewers Tessa Rhinehart and Sylvain Haupert, to Yannick Jadoul for his expert opinions on all things Praat TextGrid, and to Chiara Marmo for being the editor that brought it all together. Really appreciate all your contributions and the time you all put into this.