[BUG REPORT] Whisper and the app is saving in the general #232

Open
AcTePuKc opened this issue Jun 13, 2024 · 0 comments
Labels
bug Something isn't working

Describe the bug
When using the batch processing feature in the Whisper tab of the audio-webui, the application processes the files but does not move them from the temporary folder. This results in the batch processing output files remaining in the temporary directories (e.g., F:\Voice\audio-webui\data\temp and subfolders with random names) instead of being moved to the designated output location in the app's subfolders.

To Reproduce
Steps to reproduce the behavior:

  1. Go to the Text to Speech tab, select Bark, and generate audio from any text.
  2. This generates files in %LOCALAPPDATA%\Temp.
  3. Go to the Whisper tab in the audio-webui.
  4. Use the 'Batch input' feature to upload multiple .wav files (e.g., 38 files).
  5. Start the batch processing.
  6. Observe that the files are processed and output files are generated but remain in the temporary folders instead of being moved to the designated output location.
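To confirm the last step without browsing the folders by hand, a short script can list what is left behind. This is only a sketch: `find_leftover_files` is a hypothetical helper, not part of audio-webui, and the default suffix list is an assumption based on the file types mentioned in this report.

```python
from pathlib import Path


def find_leftover_files(temp_dir, suffixes=(".wav", ".png", ".mp4")):
    """Return the media files still under temp_dir, as sorted relative paths."""
    temp_dir = Path(temp_dir)
    return sorted(
        path.relative_to(temp_dir).as_posix()
        # rglob also descends into the randomly named subfolders
        for path in temp_dir.rglob("*")
        if path.is_file() and path.suffix.lower() in suffixes
    )
```

Running it against the temp folder from this report (for example `find_leftover_files(r"F:\Voice\audio-webui\data\temp")`) before and after a batch run should show whether any outputs stayed behind.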

Expected behavior
After processing, the output files should be moved from the temporary folders to the specified output directory within the app's subfolders, so that files are managed in one place and the temporary directories do not fill up.
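A minimal sketch of the move step described here, assuming the batch outputs can be identified by file extension. `move_batch_outputs` and its parameters are hypothetical names used for illustration; this is not how audio-webui currently handles its files.

```python
import shutil
from pathlib import Path


def move_batch_outputs(temp_dir, output_dir, suffixes=(".wav", ".png", ".mp4")):
    """Move matching files from temp_dir (including subfolders) into output_dir."""
    temp_dir = Path(temp_dir)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    # sorted() materializes the listing first, so moving files while
    # iterating does not disturb the directory walk.
    for src in sorted(temp_dir.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in suffixes:
            continue
        dest = output_dir / src.name
        # Files in different temp subfolders can share a name (e.g. 6.wav),
        # so add a numeric suffix instead of overwriting.
        counter = 1
        while dest.exists():
            dest = output_dir / f"{src.stem}_{counter}{src.suffix}"
            counter += 1
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

The function returns the destination paths, so the caller could print them to show where each file ended up.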

Additional context

  • The issue occurs consistently with a batch of 38 .wav files.
  • The problem might be related to the handling of temporary folders and file paths within the Whisper batch processing function or the temporary folders in general.
  • Example of environment:
    • Python version: 3.10.11
    • Gradio version: 3.49.0
    • Other dependencies and versions as listed in the requirements.

Directory Structure Example:

F:\Voice\audio-webui\data\temp\
├── 1f6idh9fl.txt
├── 2y976ccx9.txt
├── 3fu2c7ign.txt
...
├── tmp00em15ar
├── tmp07ivbjuh
├── tmp0ge478ud
├── tmp1l5t0gmj
...
├── 0351ac6be3bc7041cfef87b319d52126ab5860b8
│   └── 6.wav
├── 04695579ce2d741c2c1d1c3fcb94bec74c6dbcc0
│   └── 37.wav
...

The generated text files are stored correctly in the app's subfolders, but the associated images, wav, and mp4 files are being saved in the user's temporary directories (%LOCALAPPDATA%\Temp). This discrepancy may be due to path handling in the batch processing code.
Suggestion: define the output paths relative to the app folder and print them at startup. This might clutter the console, but it would show where those files are saved.

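import os

# Function to detect the main app folder dynamically.
# Note: this returns the folder of the file that contains this function,
# so it should live in a module at the root of the app.
def get_main_app_folder():
    return os.path.dirname(os.path.abspath(__file__))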

# Define base paths using the detected main app folder
main_app_folder = get_main_app_folder()
base_input_path = os.path.join(main_app_folder, "data", "inputs")
base_output_path = os.path.join(main_app_folder, "data", "outputs")
base_output_path_png = os.path.join(main_app_folder, "data", "outputs", "png")
base_output_path_wav = os.path.join(main_app_folder, "data", "outputs", "wav")
base_output_path_mp4 = os.path.join(main_app_folder, "data", "outputs", "mp4")
base_output_path_txt = os.path.join(main_app_folder, "data", "outputs", "txt")

# Ensure the directories exist
os.makedirs(base_input_path, exist_ok=True)
os.makedirs(base_output_path, exist_ok=True)
os.makedirs(base_output_path_png, exist_ok=True)
os.makedirs(base_output_path_wav, exist_ok=True)
os.makedirs(base_output_path_mp4, exist_ok=True)
os.makedirs(base_output_path_txt, exist_ok=True)

# Rest of the processing code goes here
# ...

print(f"Base input path: {base_input_path}")
print(f"Base output path: {base_output_path}")
print(f"Base output path for PNG: {base_output_path_png}")
print(f"Base output path for WAV: {base_output_path_wav}")
print(f"Base output path for MP4: {base_output_path_mp4}")
print(f"Base output path for TXT: {base_output_path_txt}")
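Building on the per-type folders above, the snippet below sketches how a destination could be chosen from a file's extension. `SUBFOLDER_BY_EXTENSION` and `output_path_for` are hypothetical names introduced for illustration, and the mapping simply mirrors the `png`, `wav`, `mp4`, and `txt` subfolders suggested in this report.

```python
import os

# Hypothetical mapping from file extension to a subfolder of data/outputs.
SUBFOLDER_BY_EXTENSION = {".png": "png", ".wav": "wav", ".mp4": "mp4", ".txt": "txt"}


def output_path_for(filename, base_output_path):
    """Return the destination path for filename under base_output_path."""
    extension = os.path.splitext(filename)[1].lower()
    # Unknown extensions fall back to the base output folder itself.
    subfolder = SUBFOLDER_BY_EXTENSION.get(extension, "")
    target_dir = os.path.join(base_output_path, subfolder)
    os.makedirs(target_dir, exist_ok=True)
    return os.path.join(target_dir, os.path.basename(filename))
```

For example, `output_path_for("6.wav", base_output_path)` would resolve to the `wav` subfolder, and printing the returned path would show where each file is saved.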


@AcTePuKc AcTePuKc added the bug Something isn't working label Jun 13, 2024
2 participants