I started an overseas football betting service without ever having gambled once. Everything was clumsy and hard. The longer I ran the business, the more Ethereum I lost. People who won Ethereum kept coming back, and people who lost never returned.


The blockchain scene was full of scammers. My server was hacked, and I was scammed again and again with promotion as the bait. On top of that, Ethereum gas fees swung every day. One day they were 30 won, another day 1,300 won.

During the six months the service ran, I hardly slept. My parents were always worried. The red eyes and sagging dark circles were etched in so deeply that they seemed like they would never go away. Then COVID-19 suspended every football league.

The service was suspended too. A loss of 100 million won. That one year was probably the period of my life in which I built something for the world with an intensity I still find hard to believe. The pity is that it was merely a gambling site. I will never stake my life on something like that again. In that spirit, I am making this service fully public.

 

https://github.com/vEduardovich/whitebetting

 


* Installing and running Whisper-WebUI

- Whisper-WebUI is a library that wraps OpenAI's whisper (https://github.com/openai/whisper) in a web UI to make it easier to use.

- It performs remarkably well. For movies without subtitles, Japanese adult videos whose content you were curious about, or foreign YouTube videos, first generate subtitles in English or Japanese and then convert them to Korean. Generating Korean directly also works, but in the few attempts I made the quality was worse.

- max_length is limited to 200, so a long text cannot be translated in one go. The translation model is facebook/nllb-200.

- To translate a large amount at once, you need a separate script that cuts the text into pieces of 200 or less and passes them along. I plan to write and share one later.
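Until that script exists, the splitting step could look roughly like the sketch below. It is only an illustration under assumptions: `chunk_text` is a hypothetical helper name, and it counts characters as a rough stand-in for the limit of 200, which the model may actually measure in tokens rather than characters.

```python
import re


def chunk_text(text: str, limit: int = 200) -> list[str]:
    """Split text into chunks of at most `limit` characters, breaking at sentence ends."""
    # Split after ., ! or ? followed by whitespace; punctuation stays with its sentence.
    sentences = [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]
    chunks: list[str] = []
    current = ''
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split into fixed-size pieces.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ''
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f'{current} {sentence}' if current else sentence
        if len(candidate) <= limit:
            current = candidate  # still fits, keep growing this chunk
        else:
            chunks.append(current)  # flush and start a new chunk
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == '__main__':
    sample = 'First sentence. ' * 30
    for piece in chunk_text(sample):
        print(len(piece), piece[:40])
```

Each chunk would then be sent to the translator separately and the results joined back together in order.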

 

1. Install the codec

2. Install

  • git clone https://github.com/jhj0517/Whisper-WebUI.git
  • Go into the cloned folder, paste in the entire codec folder you unzipped above, and add that location to PATH

3. Run install.bat (the download takes a long while)
4. Run start-webui.bat
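Before running install.bat it may be worth checking that the PATH entry actually took effect. The sketch below assumes the codec in step 1 is ffmpeg, which openai/whisper needs on the command line; the post itself does not name it, and `find_tool` is a hypothetical helper.

```python
import shutil


def find_tool(name: str = 'ffmpeg'):
    """Return the full path of `name` if it can be found on PATH, else None."""
    return shutil.which(name)


if __name__ == '__main__':
    path = find_tool()
    if path:
        print(f'ffmpeg found at {path}')
    else:
        print('ffmpeg is not on PATH yet; check the folder you registered')
```

If it reports that nothing was found, open a new terminal first, since PATH changes usually do not reach windows that were already open.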

 

* Running

1. First go to the Youtube tab, paste in an address, and try generate.

2. As of now (23.06.26), the following error appears.

pytube.exceptions.RegexMatchError: get_throttling_function_name: could not find match for multiple

3. Google for a fix.

https://github.com/pytube/pytube/issues/1684

 


This fix was posted two days ago. YouTube's patterns seem to change so often that the updates cannot keep up.

function_patterns = [
    # https://github.com/ytdl-org/youtube-dl/issues/29326#issuecomment-865985377
    # https://github.com/yt-dlp/yt-dlp/commit/48416bc4a8f1d5ff07d5977659cb8ece7640dcd8
    # var Bpa = [iha];
    # ...
    # a.C && (b = a.get("n")) && (b = Bpa[0](b), a.set("n", b),
    # Bpa.length || iha("")) }};
    # In the above case, `iha` is the relevant function name
    r'a\.[a-zA-Z]\s*&&\s*\([a-z]\s*=\s*a\.get\("n"\)\)\s*&&.*?\|\|\s*([a-z]+)',
    r'\([a-z]\s*=\s*([a-zA-Z0-9$]+)(\[\d+\])?\([a-z]\)',
]

Go to the Whisper-WebUI\venv\Lib\site-packages\pytube folder and open cipher.py.

Replace the existing code below

r'a\.[a-zA-Z]\s*&&\s*\([a-z]\s*=\s*a\.get\("n"\)\)\s*&&\s*'

with the new code below.

r'a\.[a-zA-Z]\s*&&\s*\([a-z]\s*=\s*a\.get\("n"\)\)\s*&&.*?\|\|\s*([a-z]+)',

Now enter the YouTube address again and generate; it runs without errors.
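To see what the replacement pattern captures, it can be tried on the example from the patch comment. This is only a sanity check: `SAMPLE_JS` is that commented example joined onto one line (an assumption, since real player scripts are minified and differ), and `throttling_function_name` is a hypothetical helper, not pytube's own function.

```python
import re

# The replacement pattern from the fix.
NEW_PATTERN = r'a\.[a-zA-Z]\s*&&\s*\([a-z]\s*=\s*a\.get\("n"\)\)\s*&&.*?\|\|\s*([a-z]+)'

# Example fragment from the patch comment, joined onto one line (assumption).
SAMPLE_JS = (
    'a.C && (b = a.get("n")) && (b = Bpa[0](b), a.set("n", b), '
    'Bpa.length || iha("")) }};'
)


def throttling_function_name(js: str):
    """Return the captured function name, or None when the pattern does not match."""
    match = re.search(NEW_PATTERN, js)
    return match.group(1) if match else None


if __name__ == '__main__':
    print(throttling_function_name(SAMPLE_JS))  # prints: iha
```

The lazy `.*?` skips everything up to the `||`, and the capture group picks up the name that follows it, which matches the `iha` named in the comment.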

 


For someone who is not a designer, making a BI or CI is always an ordeal.

It is probably not easy for designers either.

 

The biggest problem is that even I do not know what I want, and Stable Diffusion solves this neatly. Given a concept, I generated about 300 images while switching checkpoints and prompts, and dozens of them were ones I liked.

 

The LoRA can be made in Colab, or with kohya_gui as I did.

https://github.com/bmaltais/kohya_ss

 

1. After installing kohya,

2. use the scripts from https://blog.himion.com/175 or https://blog.himion.com/176 to collect BI images from Naver and Google.

3. Keep only the ones of decent quality. I used only about 100.

4. In kohya, go to Utilities > Captioning > BLIP Captioning. This uses BLIP, a vision-language captioning model from Salesforce. Caption every image, then create a folder inside kohya and put them in it.

5. Create the LoRA with kohya's Dreambooth LoRA. Thanks to a 4090, the quality was good even with batch size raised to 8 and Epoch cut to 1. Making the BI LoRA from about 100 images took 1 hour 15 minutes.

6. With the prompt below, I generated about 300 images.

Model: 2dn_1, Version: v1.2.1
positive
<lora:brand_identity:1>, logo, bi, ci, brand, text, "Wendy", beyond the world, dimly ufo of ghost, (moon:0.3), sky, <lora:weird:0.4>
Negative prompt: easynegative, ng_deepnegative_v1_75t, ((worst quality)), ((low quality)), easynegative,
Steps: 20, Sampler: Euler a, CFG scale: 4.5, Seed: 366227024, Size: 512x512, 
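
The prompt combines two weighting forms: `<lora:name:weight>` loads a LoRA at a given strength and `(term:weight)` scales a single term. The sketch below pulls both out of the positive prompt above. It assumes the Stable Diffusion web UI prompt conventions, which the settings line appears to follow, and `parse_prompt` is a hypothetical helper written for illustration.

```python
import re

PROMPT = (
    '<lora:brand_identity:1>, logo, bi, ci, brand, text, "Wendy", beyond the world, '
    'dimly ufo of ghost, (moon:0.3), sky, <lora:weird:0.4>'
)


def parse_prompt(prompt: str):
    """Return (loras, weighted_terms), each a dict mapping name to weight."""
    # <lora:name:weight> tags
    loras = {
        name: float(weight)
        for name, weight in re.findall(r'<lora:([^:>]+):([\d.]+)>', prompt)
    }
    # (term:weight) emphasis groups
    terms = {
        term.strip(): float(weight)
        for term, weight in re.findall(r'\(([^():]+):([\d.]+)\)', prompt)
    }
    return loras, terms


if __name__ == '__main__':
    loras, terms = parse_prompt(PROMPT)
    print(loras)
    print(terms)
```

Here the brand_identity LoRA is applied at full strength, the weird LoRA at 0.4, and moon is toned down to 0.3, which are the values to adjust when hunting for a different mood.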

> Conclusion

1. BIs come out in a variety of moods I could never have imagined.

2. The mood varies even more depending on the checkpoint.

3. A sampler with simple processing works best, since these are not photorealistic images.

4. Inserting text accurately is hard. Text-centered logos need more thought.

5. Someone with basic Photoshop and Illustrator skills could churn these out. Fantastic.

I got up at 11.

I kept muttering to myself.

My thoughts turned to how a successful founder would have acted.

 

Too late. Late, and far too late.

Maybe everyone had already picked up on it.

Maybe that is why they all acted the way they did.. (they surely did)

 

While washing the dishes piled up in the sink,

I realized that everything was cleanly in place. I had no reason to be late.

Because I kept being late, even my heart had stopped pounding.

 

I thought about how a successful founder would have acted.

They would not have used going to bed at 5 a.m. as an excuse.

I have never succeeded, so I cannot be sure, but even I, who have never succeeded, can tell that I will not succeed like this.

 

Being late is understandable.

But my dream is not something understandable.

Something that makes no sense cannot be achieved in a way that makes sense.

 

I have to pick up speed.

* Crawling with selenium running only in memory, without opening a browser window

  1. Almost identical to the previous code ( https://blog.himion.com/176 ).
  2. A headlessDriver() function has been added.
'''
* Fetch Google images, ver. Headless
'''

import os
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import undetected_chromedriver as uc

import urllib.parse
import urllib.request
import time, datetime

ITEM_LIST = [ "Keith Thompson", "Zdzislaw Beksinski", "dariusz zawadzki"] # (1)
FOLDER = 'google' # (2)
IMG_XPATH = '//*[@id="Sva75c"]/div[2]/div/div[2]/div[2]/div[2]/c-wiz/div/div/div/div[3]/div[1]/a/img[1]' # (3)
SIGNINURL = 'https://accounts.google.com/signin/v2/identifier?hl=ko&passive=true&continue=https%3A%2F%2Fwww.google.com%2F&ec=GAZAmgQ&flowName=GlifWebSignIn&flowEntry=ServiceLogin'
ID = 'xxxx@gmail.com' # (4)
PASSWORD = 'xxxx' # (5)

def main():
  start = check_start() # start timing
  driver = headlessDriver() # use this when you want headless mode
  driver.get(SIGNINURL)
  googleSignIn(driver) # sign in to Google
  
  for searchItem in ITEM_LIST:
    saveDir = makeFolder(searchItem)
    
    url = makeUrl(searchItem) # build the search URL
    driver.get(url) # go to the image search results
    maximizeWindow(driver) # maximize the window
    scrollToEnd(driver)

    forbiddenCount = saveImgs(driver, saveDir, start) # fetch the src of every full-size image and save it
    sec = check_time(start)
    print(f'failed: {forbiddenCount}, {sec}, {datetime.datetime.now().time()}')
  time.sleep(10)
  driver.quit()
  
def headlessDriver():
  options = uc.ChromeOptions()
  options.headless=True
  options.add_argument('--headless=new')
  driver = uc.Chrome(options=options)
  return driver

# Google sign-in
def googleSignIn(driver):
  idBtn = driver.find_element(By.XPATH,'//*[@id="identifierId"]') # ID input field
  idBtn.send_keys(ID)
  nextBtn = driver.find_element(By.XPATH,'//*[@id="identifierNext"]/div/button')
  nextBtn.click() # click the Next button

  # The code below waits up to 10 seconds for the password element to appear,
  # but the password field is a "not interactive" element, so an error occurs.
  # It is working code though, so use it whenever a wait is needed.
  try:
    passwordBtn = WebDriverWait(driver, timeout=10).until(EC.presence_of_element_located( (By.XPATH,'//*[@id="password"]/div[1]/div/div[1]/input') ))
    time.sleep(4)
    passwordBtn = driver.find_element(By.XPATH,'//*[@id="password"]/div[1]/div/div[1]/input') # password input field
    passwordBtn.send_keys(PASSWORD)
    passwordNextBtn = driver.find_element(By.XPATH,'//*[@id="passwordNext"]/div/button')
    passwordNextBtn.click() # Next button after the password
    print('Google sign-in succeeded')
    # driver.implicitly_wait(10)
  except OSError as e:
    print(e)
    
  time.sleep(20) # leave plenty of time for phone verification and the like


# Build the Google image search URL
def makeUrl(searchItem):
  url = 'https://www.google.com/search'
  params ={ # q and tbm are required
    'q'     : searchItem,
    'tbm'   : 'isch',
  }
  url = url + '?' + urllib.parse.urlencode(params)
  return url


# Create the folder
def makeFolder(searchItem):
  saveDir = os.path.join(os.getcwd(), 'data', f'{FOLDER}_{searchItem}')
  try:
    if not(os.path.isdir(saveDir)): # if the folder does not exist
      os.makedirs(os.path.join(saveDir)) # create it
    return saveDir
  except OSError as e:
    # e is an exception object, so format it instead of concatenating with +
    print(f'{e} failed to create folder')

# Maximize the window
def maximizeWindow(driver):
  driver.maximize_window()

# Scroll down endlessly to load the full image list
def scrollToEnd(driver):
  prev_height = driver.execute_script('return document.body.scrollHeight')
  print(f'prev_height: {prev_height}')
  
  while True:
    time.sleep(1) # Naver gets stuck loading forever if you scroll without a sleep.
    driver.execute_script('window.scrollTo(0, document.body.scrollHeight)')
    time.sleep(3)
    
    cur_height = driver.execute_script('return document.body.scrollHeight')
    print(f'cur_height: {cur_height}')
    if cur_height == prev_height:
      print('height stopped changing')
      break
    prev_height = cur_height
  # Once the page is fully loaded, scroll back to the top
  driver.execute_script('window.scrollTo(0, 0)')

# Save all images
def saveImgs(driver, saveDir, start):
  time.sleep(1)
  forbiddenCount = 0
  imgs = driver.find_elements(By.CSS_SELECTOR, '.rg_i.Q4LuWd')
  img_count = len(imgs)
  print(f'total images: {img_count}')
  # Click each thumbnail in turn and save the image
  for imgNum, img in enumerate(imgs): # imgNum holds the image index, starting at 0
    try:
      img.click()
      time.sleep(3)
      
      # The XPath below seems to change often. The rest looks stable, so just check this one now and then.
      bigImg = driver.find_element(By.XPATH, IMG_XPATH)
      src = bigImg.get_attribute('src')
      urllib.request.urlretrieve(src, saveDir + '/' + str(imgNum) + '.jpg')
      sec = check_time(start)
      print(f'{imgNum+1}/{img_count} saved {sec}')

    except Exception as e:
      print(e)
      forbiddenCount += 1 # number of failed saves; forbidden responses and file errors are fairly common
      continue
  return forbiddenCount


# Timing helpers
def check_start():
    start_time = time.time()
    print("Start! now.." + str(start_time))
    return start_time
def check_time(start):
    end = time.time()
    during = end - start
    sec = str(datetime.timedelta(seconds=during)).split('.')[0]
    return sec

# Run only when executed as a script, not when imported
if __name__ == '__main__':
    main()
