Commit

Merge pull request #162 from bagustris/master
Add balancing for finetune and update data README
felixbur authored Sep 18, 2024
2 parents fe5021a + 11fd578 commit 5fc59b1
Showing 129 changed files with 2,152 additions and 524 deletions.
12 changes: 12 additions & 0 deletions .github/workflows/format_code.yml
@@ -0,0 +1,12 @@
name: Check code formatting

on: [push, pull_request]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: psf/black@stable
      # - uses: psf/black@552baf822992936134cbd31a38f69c8cfe7c0f05

16 changes: 16 additions & 0 deletions .github/workflows/isort.yaml
@@ -0,0 +1,16 @@
name: Run isort

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: isort/isort-action@v1
      # - uses: isort/isort-action@master
        with:
          # isortVersion: 5.13.2
          sortPaths: 'nkululeko'
          configuration: '--profile black'

11 changes: 8 additions & 3 deletions CONTRIBUTING.md
@@ -21,7 +21,7 @@ The preferred way to contribute to nkululeko is to fork the [main repository](ht

```bash
git clone https://github.com/YourLogin/nkululeko.git
cd spafe
cd nkululeko
```

3. Remove any previously installed nkululeko versions, then install your local copy with testing dependencies:
@@ -43,9 +43,14 @@ The preferred way to contribute to nkululeko is to fork the [main repository](ht
-> Please never work directly on the `master` branch!
```

6. Once you are done, make sure to format the code using black to fit spafe's codestyle.
6. Once you are done, make sure to format the code using black to fit Nkululeko's codestyle.

```black nkululeko/```
```bash
black nkululeko/
isort --profile black nkululeko/
# Alternatively and additionaly, use ruff:
ruff check --fix --output-format=full nkululeko
```

7. Make sure that the tests succeed and have enough coverage.
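
The three formatting commands from step 6 can also be chained from a small Python helper, for example as a pre-push check. This is a minimal sketch, not part of the commit: the function names `formatting_commands` and `run_formatters` are hypothetical, and it assumes `black`, `isort`, and `ruff` are installed and on `PATH`.

```python
import subprocess
from typing import Callable, List, Sequence


def formatting_commands(package: str = "nkululeko") -> List[List[str]]:
    """Return the formatter invocations from step 6, in the same order."""
    return [
        ["black", f"{package}/"],
        ["isort", "--profile", "black", f"{package}/"],
        ["ruff", "check", "--fix", "--output-format=full", package],
    ]


def run_formatters(
    package: str = "nkululeko",
    runner: Callable[[Sequence[str]], int] = subprocess.call,
) -> List[int]:
    """Run each command in turn and collect exit codes.

    Stops at the first non-zero exit code so a failure is not hidden
    by the output of later tools.
    """
    codes: List[int] = []
    for cmd in formatting_commands(package):
        code = runner(cmd)
        codes.append(code)
        if code != 0:
            break
    return codes
```

Calling `run_formatters()` from the repository root runs the tools against `nkululeko/`; the `runner` parameter exists only so the helper can be exercised without the tools installed.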

12 changes: 6 additions & 6 deletions data/README.md
@@ -5,7 +5,7 @@ Nkululeko database repository
# Data


This is the default top directory for Nkululeko data import.Each database should be in its own subfolder (you can also use `ln -sf`` to soft link original database path to these subfolders) and contain a README how to import the data to Nkululeko CSV or audformat.
This is the default top directory for Nkululeko data import. Each database should be in its own subfolder (you can also use `ln -sf` to soft link original database path to these subfolders) and contain a README how to import the data to Nkululeko CSV or audformat.
## Accessibility


@@ -15,12 +15,10 @@ The column `access` in the table below indicates the database's accessability. T
- `private`: the database is not publicly available on the internet and requires the private information of the owner of the dataset.


To support open science and reproducible research, we only accept PR and recipes for public dataset for now on.
## Databases

To support open science and reproducible research, we encourage to submit PR and recipes for public dataset for now on.
|Name|Target|Description|Access|
| :--- | :--- | :--- | :--- |
|emorynlp|emotion|English, From Friends TV|public|
|emorynlp|emotion|English Emotion Dataset from Friends TV Show|public|
|emns|emotion,intensity|British, singles peaker, UAR=.479|public|
|test|none|Test data for nkululeko|public|
|catsvsdogs|cats_dogs|kaggle test set|public|
@@ -72,11 +70,13 @@ To support open science and reproducible research, we only accept PR and recipes
|urdu|emotion|Urdu|public|
|polish|emotion|Polish|public|
|cmu-mosei|sentiment,emotion|English, original link dead|public|
|SVD|pathologicalspeech|German|public|
|svd|pahtological speech|German speech data for detecting various pathological voices|public|
|msp-improv|emotion,VAD,naturalness|English|restricted|
|shemo|emotion|Persian|public|
|esd|emotion|English,Chinese|public|


This recipe contains information about 56 datasets.
## Performance

![Nkululeko performance](../meta/images/nkululeko_ser_20240719.png)
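
The data README above suggests soft-linking an original database path into its own subfolder with `ln -sf`. A rough Python equivalent might look like the sketch below. The helper name `link_database` and the database name `mydb` are hypothetical, and the demonstration runs in a temporary directory rather than in the real `data/` folder.

```python
import tempfile
from pathlib import Path


def link_database(original: Path, data_root: Path, name: str) -> Path:
    """Create data_root/name as a symlink pointing at the original database."""
    target = data_root / name
    if target.is_symlink():
        # Replace an existing link, similar in spirit to the -f flag.
        target.unlink()
    elif target.exists():
        # Never overwrite a real file or directory.
        raise FileExistsError(f"{target} exists and is not a symlink")
    target.symlink_to(original.resolve(), target_is_directory=True)
    return target


# Self-contained demonstration inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "downloads" / "mydb"
    original.mkdir(parents=True)
    (original / "README.md").write_text("import instructions")

    data_root = Path(tmp) / "data"
    data_root.mkdir()

    link = link_database(original, data_root, "mydb")
    was_link = link.is_symlink()
    linked_readme = (link / "README.md").read_text()
```

Files reached through the link are the same files as in the original location, so the database does not need to be copied into the repository.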
3 changes: 2 additions & 1 deletion data/androids/process_database.py
@@ -15,9 +15,10 @@
"""

import pandas as pd
import os

import audeer
import pandas as pd

dataset_name = 'androids'
data_root = './Androids-Corpus/'
8 changes: 5 additions & 3 deletions data/banglaser/process_database.py
@@ -14,12 +14,14 @@
GG = Actor ID, 01-34 (odd: male, even: female)
"""

import pandas as pd
from nkululeko.utils.files import find_files
import argparse
from sklearn.model_selection import train_test_split
from pathlib import Path

import pandas as pd
from sklearn.model_selection import train_test_split

from nkululeko.utils.files import find_files


def process_database(data_dir, output_dir):
    # check if data_dir exists
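
The docstring context of this hunk states that the actor ID `GG` runs from 01 to 34, with odd IDs for male and even IDs for female speakers. A minimal sketch of that mapping follows. The function name `gender_from_actor_id` is hypothetical, it is not taken from `process_database.py`, and it assumes the two-digit ID has already been extracted from the filename.

```python
def gender_from_actor_id(actor_id: str) -> str:
    """Map a two-digit actor ID (01-34) to 'male' (odd) or 'female' (even)."""
    number = int(actor_id)
    if not 1 <= number <= 34:
        raise ValueError(f"actor ID out of range 01-34: {actor_id!r}")
    return "male" if number % 2 == 1 else "female"
```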
1 change: 0 additions & 1 deletion data/crema-d/load_db.py
@@ -2,7 +2,6 @@

import audb


# set download directory to current
cwd = os.getcwd()
audb.config.CACHE_ROOT = cwd