xtts-webui is a library that makes xtts, a natural-sounding speech generator, easy to use.

https://github.com/daswer123/xtts-webui.git

 


 

* Installing on Windows

- Installation is very simple and works as described on GitHub. The developer put together good batch files.

1. git clone https://github.com/daswer123/xtts-webui.git xtts
2. cd xtts
3. Run 'install.bat'
- The installation takes quite a while.
4. Run 'start_xtts_webui.bat'
- Likewise, the first launch spends a long time installing.
5. It then starts smoothly, and a web page opens.

 

 

Usage will be covered in the next post.

To start with the conclusion: coqui tts - xtts was shut down as of January 3, 2024.
https://twitter.com/_josh_meyer_/status/1742522906041635166
 


 

* Motivation

I need a free TTS that can generate an unlimited amount of speech in the voice I want.

 

* Process

1. After testing several options, openVoice v2 and coqui-tts (xtts) v2 were the best.

2. Along the way I learned that open-source projects can no longer keep up with the remarkable pace of ElevenLabs https://elevenlabs.io/.

3. When I actually ran them, the quality gap was large. ElevenLabs' competitor now looks to be OpenAI, not the open-source camp. On top of that, xtts already closed its doors in early January.

4. Seeing the quality of ChatGPT 4o and ElevenLabs, I get the feeling that the open-source camp's efforts so far may come to nothing. Just look at Stable Diffusion's GitHub: the last commit was two years ago. The ecosystem now revolves around the Stable Diffusion web UI rather than Stable Diffusion itself. Meanwhile Stable Diffusion 3 already came out three months ago; it just has not been released publicly.

5. Meta now looks like the only one left. (Please, let Grok pull through.)

6. Even so, being 'unlimited' and 'free' looked like advantages that could, to some extent, stand up to the serious drawbacks of 'inconvenient to use' and 'lower quality'.

 

* Impressions after using xtts

Pros - Fairly natural-sounding TTS is possible.

Cons - The ko model cannot read English at all. Even with the g2pk2 module it falls well short. In English passages the pronunciation is completely mangled and a robotic noise comes out. This is a fatal flaw. The en model can read only English, and the ko model can read only Korean.
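Since each model reads only one language, one possible workaround is to split mixed text into Korean and English runs before synthesis and send each run to the matching model. The following is only a minimal sketch of that splitting step; `split_by_language` is a hypothetical helper written for this post, not part of xtts or g2pk2, and it does not call any TTS model.

```python
import re

# Hangul syllables and compatibility jamo count as Korean; ASCII letters count as English.
_HANGUL = re.compile(r"[\uac00-\ud7a3\u3131-\u318e]")
_LATIN = re.compile(r"[A-Za-z]")


def split_by_language(text: str) -> list[tuple[str, str]]:
    """Split text into (lang, segment) runs, where lang is 'ko' or 'en'.

    Neutral characters (spaces, digits, punctuation) stay attached to
    the run that is currently open.
    """
    runs: list[tuple[str, str]] = []
    current_lang = None
    buffer = ""
    for ch in text:
        if _HANGUL.match(ch):
            lang = "ko"
        elif _LATIN.match(ch):
            lang = "en"
        else:
            lang = current_lang  # neutral character: keep the current run
        # Close the open run when the language actually switches.
        if lang is not None and current_lang is not None and lang != current_lang:
            runs.append((current_lang, buffer))
            buffer = ""
        if lang is not None:
            current_lang = lang
        buffer += ch
    if buffer:
        # Text with no letters at all defaults to 'ko' (an arbitrary choice).
        runs.append((current_lang or "ko", buffer))
    return runs
```

Each run would then still have to be synthesized separately and the audio joined, so voice consistency across the two models remains an open question.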

 

 

 

* Final conclusion

I went back to open voice v2. It is not as natural, but it reads long texts reliably. Mixed English and Korean is no problem. Unlike xtts, it looks like it will keep being updated. Setting up the environment is hard, but once it runs it is fast. So I plan to look into ways of raising the voice quality on top of open voice v2 (for example, post-processing with rvc).

Partly because I did not want to waste several days of research, I tried every way I could to get good-quality TTS out of openVoice V2, but in the end I dropped it. It did not work out.

Then I came across xtts. Stock xtts alone is already far better than openVoice v2, and after fine-tuning the result sounded even closer to the target voice.

The pronunciation is a little slurred, but I think that can be tuned somehow.

Whose voice does it sound like?

 

The GitHub address is

https://github.com/daswer123/xtts-webui

 


 

I am too sleepy right now, so I will write up installation and fine-tuning in the next post.

https://github.com/myshell-ai/OpenVoice

 


 

There is one reason I became interested in OpenVoice.

"I want to clone a voice. I want a voice I like to read Korean text to me without limits."

 

Existing TTS tools had length limits and were complicated to use, so I wanted to build an app that does this very simply:

1. Attach a large text file

2. Click the generate-voice button

3. A wav file is produced

So far the R&D results are negative, but I want to keep a record of how I arrived at that negative conclusion.
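For the "large text file in, one wav file out" idea, the text would most likely have to be synthesized piece by piece and the pieces joined afterwards. Below is a minimal sketch of just the joining step using Python's standard `wave` module. `concat_wavs` is a hypothetical helper, and it assumes every part was generated with the same channel count, sample width, and sample rate.

```python
import wave


def concat_wavs(parts: list[str], output_path: str) -> int:
    """Join WAV files that share one audio format; return the total frame count."""
    if not parts:
        raise ValueError("no input files")
    total_frames = 0
    expected = None
    with wave.open(output_path, "wb") as out:
        for part in parts:
            with wave.open(part, "rb") as src:
                fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if expected is None:
                    # The first file decides the output format.
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                    expected = fmt
                elif fmt != expected:
                    raise ValueError(f"{part} has a different audio format")
                out.writeframes(src.readframes(src.getnframes()))
                total_frames += src.getnframes()
    return total_frames
```

A call such as `concat_wavs(["outputs/part0.wav", "outputs/part1.wav"], "outputs/full.wav")` would write the joined file; the file names here are placeholders.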

 

First of all, V1 does not support multiple languages.

Only English and Chinese work. If you make it read Korean, it reads it as shown below. It sounds like a native English-speaking American speaking Korean.

안녕하세요! 오늘은 날씨가 정말 좋네요. ("Hello! The weather is really nice today.")

 

* Installing from GitHub

git clone https://github.com/myshell-ai/OpenVoice.git open_voice
cd open_voice

 

* Creating the environment - my Python versions were tangled up, so I used conda.

conda create -n ov python=3.9
conda activate ov
pip install -r requirements.txt

 

* If the conda environment reports that ffmpeg is missing, be sure to install ffmpeg as shown below.

conda install ffmpeg
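To find out ahead of time whether ffmpeg is visible from the active environment, a quick check like the following may help. It is only a sketch: `ffmpeg_available` is a name made up for this post, and it tests only that an `ffmpeg` executable is on PATH, not that it works.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    print("ffmpeg not found on PATH; try: conda install ffmpeg")
```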

 

* Running on CPU - skip this if you use CUDA!

- At the moment open voice cannot run on CPU. The code for it is missing and has not been fixed upstream, so you have to patch it yourself.

First open se_extractor.py and find the code below at line 22:

device = "cuda" if torch.cuda.is_available() else "cpu"
model = WhisperModel(model_size, device=device, compute_type="float16")

Change it as follows.

device, compute_type = ("cuda","float16") if torch.cuda.is_available() else ("cpu", "int8")
model = WhisperModel(model_size, device=device, compute_type=compute_type)
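The selection logic in that patch can also be written as a small standalone function, which makes it easy to check without a GPU or even torch installed. This is a sketch only; `pick_whisper_settings` is a hypothetical name, and the `float16`/`int8` pairing simply mirrors the values used in the patch above.

```python
def pick_whisper_settings(cuda_available: bool) -> tuple[str, str]:
    """Return (device, compute_type) for the Whisper model.

    float16 is used on CUDA; on CPU the patch above falls back to int8.
    """
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "int8"


# In se_extractor.py this would be called roughly as:
# device, compute_type = pick_whisper_settings(torch.cuda.is_available())
```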

 

 

* That's it. Now let's run it. I had it read Korean. If you want to use English, uncomment the commented-out lines below.

import os
import torch
from openvoice import se_extractor
from openvoice.api import BaseSpeakerTTS, ToneColorConverter

ckpt_base = 'checkpoints/base_speakers/EN'
ckpt_converter = 'checkpoints/converter'
device="cuda:0" if torch.cuda.is_available() else "cpu"
output_dir = 'outputs'

base_speaker_tts = BaseSpeakerTTS(f'{ckpt_base}/config.json', device=device)
base_speaker_tts.load_ckpt(f'{ckpt_base}/checkpoint.pth')

tone_color_converter = ToneColorConverter(f'{ckpt_converter}/config.json', device=device)
tone_color_converter.load_ckpt(f'{ckpt_converter}/checkpoint.pth')

os.makedirs(output_dir, exist_ok=True)

source_se = torch.load(f'{ckpt_base}/en_default_se.pth').to(device)


# reference_speaker = 'resources/example_reference.mp3' # This is the voice you want to clone
reference_speaker = 'resources/lympe.mp3' # This is the voice you want to clone

target_se, audio_name = se_extractor.get_se(reference_speaker, tone_color_converter, target_dir='processed', vad=True)

# inference
save_path = f'{output_dir}/output_en_default.wav'

# Run the base speaker tts
# text = "This audio is generated by OpenVoice."
text = "안녕하세요! 오늘은 날씨가 정말 좋네요."  # "Hello! The weather is really nice today."

src_path = f'{output_dir}/tmp.wav'
base_speaker_tts.tts(text, src_path, speaker='default', language='English', speed=1.0)

# Run the tone color converter
encode_message = "@MyShell"
tone_color_converter.convert(
    audio_src_path=src_path, 
    src_se=source_se, 
    tgt_se=target_se, 
    output_path=save_path,
    message=encode_message)
    
source_se = torch.load(f'{ckpt_base}/en_style_se.pth').to(device)
save_path = f'{output_dir}/output_whispering.wav'

# Run the base speaker tts
# text = "This audio is generated by OpenVoice."
text = "안녕하세요! 오늘은 날씨가 정말 좋네요."  # "Hello! The weather is really nice today."

src_path = f'{output_dir}/tmp.wav'
base_speaker_tts.tts(text, src_path, speaker='whispering', language='English', speed=0.9)

# Run the tone color converter
encode_message = "@MyShell"
tone_color_converter.convert(
    audio_src_path=src_path, 
    src_se=source_se, 
    tgt_se=target_se, 
    output_path=save_path,
    message=encode_message)


ckpt_base = 'checkpoints/base_speakers/ZH'
base_speaker_tts = BaseSpeakerTTS(f'{ckpt_base}/config.json', device=device)
base_speaker_tts.load_ckpt(f'{ckpt_base}/checkpoint.pth')

source_se = torch.load(f'{ckpt_base}/zh_default_se.pth').to(device)
save_path = f'{output_dir}/output_chinese.wav'

# Run the base speaker tts
# text = "今天天气真好,我们一起出去吃饭吧。"
text = "안녕하세요! 오늘은 날씨가 정말 좋네요."  # "Hello! The weather is really nice today."

src_path = f'{output_dir}/tmp.wav'
base_speaker_tts.tts(text, src_path, speaker='default', language='Chinese', speed=1.0)

# Run the tone color converter
encode_message = "@MyShell"
tone_color_converter.convert(
    audio_src_path=src_path, 
    src_se=source_se, 
    tgt_se=target_se, 
    output_path=save_path,
    message=encode_message)

 

 

* Conclusion

V1 ran well on both Windows and Mac. English quality was already good enough in V1.
V2, which I will cover in the next post, works for Korean only in a CUDA environment (English and Chinese still run on CPU), and it also supports voice training. The quality is not all that satisfying, but across several tests it cloned some voices quite well.

I will leave the details for the V2 post.

* Installing and running Whisper-WebUI

- Whisper-webui is a library that wraps OpenAI's whisper (https://github.com/openai/whisper) in a web UI to make it easy to use.

- It performs remarkably well. For movies without subtitles, Japanese adult videos whose content you were curious about, or foreign YouTube videos, generate subtitles in English or Japanese first and then convert them to Korean. Generating Korean directly also works, but in the few tries I made the quality was lower.

- max_length is capped at 200, so long passages cannot be translated in one go. The translation model is facebook/nllb-200.

- To translate a large amount at once, you need a separate script that cuts the text into pieces within the 200 limit before passing them on. I plan to write and share one later.
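As a rough idea of what such a splitting script could look like, here is a minimal sketch that breaks text into chunks of at most 200 characters, preferring sentence boundaries. Note the assumptions: `chunk_text` is a hypothetical helper, and the limit is counted in characters here, whereas the `max_length` setting is probably counted in model tokens, so the character count is only a stand-in and does not guarantee a chunk fits.

```python
import re


def chunk_text(text: str, limit: int = 200) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Sentences are kept whole where possible; a single sentence longer
    than the limit is cut at the limit.
    """
    sentences = re.split(r"(?<=[.!?。])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-cut any sentence that cannot fit in one chunk.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to the translator in turn and the results joined in order.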

 

1. Install the codec

2. Installation

  • git clone https://github.com/jhj0517/Whisper-WebUI.git
  • Go into the cloned folder, paste in the entire codec folder you unzipped above, and add that location to PATH

3. Run install.bat - the download takes a long time
4. Run start-webui.bat

 

* Running

1. First go to the Youtube tab, paste in a URL, and try generate

2. As of now (23.06.26), the following error appears.

pytube.exceptions.RegexMatchError: get_throttling_function_name: could not find match for multiple

3. Search Google for a fix.

https://github.com/pytube/pytube/issues/1684

 


This fix was posted two days ago. YouTube's patterns seem to change so often that the updates cannot keep up.

function_patterns = [
    # https://github.com/ytdl-org/youtube-dl/issues/29326#issuecomment-865985377
    # https://github.com/yt-dlp/yt-dlp/commit/48416bc4a8f1d5ff07d5977659cb8ece7640dcd8
    # var Bpa = [iha];
    # ...
    # a.C && (b = a.get("n")) && (b = Bpa[0](b), a.set("n", b),
    # Bpa.length || iha("")) }};
    # In the above case, `iha` is the relevant function name
    r'a\.[a-zA-Z]\s*&&\s*\([a-z]\s*=\s*a\.get\("n"\)\)\s*&&.*?\|\|\s*([a-z]+)',
    r'\([a-z]\s*=\s*([a-zA-Z0-9$]+)(\[\d+\])?\([a-z]\)',
]

Go to the Whisper-WebUI\venv\Lib\site-packages\pytube folder, open cipher.py,

and find the existing code below:

r'a\.[a-zA-Z]\s*&&\s*\([a-z]\s*=\s*a\.get\("n"\)\)\s*&&\s*'

Replace it by pasting in the new code below.

r'a\.[a-zA-Z]\s*&&\s*\([a-z]\s*=\s*a\.get\("n"\)\)\s*&&.*?\|\|\s*([a-z]+)',

Now enter the YouTube URL again and generate; it works without errors.
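To see what the new pattern actually captures, it can be tried against the JavaScript fragment quoted in the fix's own comment. The snippet below is a small illustration using only Python's `re` module; `find_throttling_function_name` is a name made up here (it is not pytube's function), and the sample string is the commented example joined onto one line.

```python
import re

# The replacement pattern from the fix.
NEW_PATTERN = r'a\.[a-zA-Z]\s*&&\s*\([a-z]\s*=\s*a\.get\("n"\)\)\s*&&.*?\|\|\s*([a-z]+)'

# Sample JavaScript from the comment in the fix, joined onto one line.
SAMPLE_JS = 'a.C && (b = a.get("n")) && (b = Bpa[0](b), a.set("n", b), Bpa.length || iha("")) }};'


def find_throttling_function_name(js: str):
    """Return the captured function name, or None when the pattern does not match."""
    match = re.search(NEW_PATTERN, js)
    return match.group(1) if match else None


print(find_throttling_function_name(SAMPLE_JS))  # iha
```

The capture group picks up the name after `||`, which is the function the comment in the fix calls the relevant one.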

 
