Caution
This library is unmaintained and has been superseded by hxsoup.
Various convenient features built on top of requests and BeautifulSoup (requests + BeautifulSoup).

- Combines the requests library with BeautifulSoup so that several lines of code collapse into one, and async and cache variants are available out of the box.
- Comes with defaults that are convenient for web scraping.
- Includes small but handy extras such as `no_empty_result`, `attempts`, and `avoid_sslerror`.

It is a modest but useful library that cuts down the amount of code you actually have to write.
To install:

1. Install Python.
2. Run the following command in a terminal:

```
pip install -U resoup
```

requests and bs4 are installed alongside resoup, but the optional BeautifulSoup parsers lxml and html5lib are not installed by default. Install lxml, html5lib, etc. yourself to avoid errors. If you try to use one of these parsers without installing it, a `NoParserError` is raised.
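If you need those parsers, they can be installed the same way (assuming pip is available on your PATH):

```shell
pip install -U lxml html5lib
```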
Note: most examples below use `get` requests, but everything works identically with all the other methods (options/head/post/put/patch/delete).
The `resoup.requests` module can be imported as follows:

```python
from resoup import requests  # compatible with `import requests`
```

This library is 99% compatible with the requests library (even the type hints behave exactly like those of requests!), with convenient features added on top. In other words, you can replace an existing `import requests` with the line above and integrate it without breaking existing code.
requests' Session can be used in much the same way:

```python
from resoup import requests

with requests.Session() as session:
    ...  # every feature is available, including cget, attempts, etc.
```
Sensible defaults are preconfigured. They are listed below and apply to requests.get/options/head/post/put/patch/delete.

`timeout` default: 120

`headers` default:

```python
{
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "ko-KR,ko;q=0.9",
    "Sec-Ch-Ua": '"Chromium";v="116", "Not)A;Brand";v="24", "Google Chrome";v="116"',
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Ch-Ua-Platform": '"Windows"',
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
}
```

`attempts` default: 1

`avoid_sslerror` default: False
```python
>>> from resoup import requests
>>> res = requests.get("https://httpbin.org/headers")
>>> res.json()['headers']
{'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
 'Accept-Encoding': 'gzip, deflate, br',
 'Accept-Language': 'ko-KR,ko;q=0.9',
 'Host': 'httpbin.org',
 'Sec-Ch-Ua': '"Chromium";v="116", "Not)A;Brand";v="24", "Google Chrome";v="116"',
 'Sec-Ch-Ua-Mobile': '?0',
 'Sec-Ch-Ua-Platform': '"Windows"',
 'Sec-Fetch-Dest': 'document',
 'Sec-Fetch-Mode': 'navigate',
 'Sec-Fetch-Site': 'none',
 'Sec-Fetch-User': '?1',
 'Upgrade-Insecure-Requests': '1',
 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36',
 'X-Amzn-Trace-Id': ...}
```
The get/options/head/post/put/patch/delete functions in the `resoup.requests` module all return a ResponseProxy. ResponseProxy is a subclass of Response that is 100% compatible with the original. See the ResponseProxy section below for details. If you do not want to learn the extra features, you can keep using it exactly as you would a plain Response and everything works fine.
`attempts` is a parameter that sets how many times the same request is retried when a `ConnectionError` occurs for whatever reason. If the request still fails after, say, 10 attempts, the most recently failed connection's error is shown.
```python
>>> from resoup import requests
>>>
>>> requests.get('https://some-not-working-website.com', attempts=10)
WARNING:root:Retring...
WARNING:root:Retring...
WARNING:root:Retring...
WARNING:root:Retring...
WARNING:root:Retring...
WARNING:root:Retring...
WARNING:root:Retring...
WARNING:root:Retring...
WARNING:root:Retring...
WARNING:root:Retring...
Traceback (most recent call last):
  ...
socket.gaierror: [Errno 11001] getaddrinfo failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  ...
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPSConnection object at ...>: Failed to resolve 'some-not-working-website.com' ([Errno 11001] getaddrinfo failed)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  ...
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='some-not-working-website.com', port=443): Max retries exceeded with url: / (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at ...>: Failed to resolve 'some-not-working-website.com' ([Errno 11001] getaddrinfo failed)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  ...
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='some-not-working-website.com', port=443): Max retries exceeded with url: / (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at ...>: Failed to resolve 'some-not-working-website.com' ([Errno 11001] getaddrinfo failed)"))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  ...
ConnectionError: Trying 10 times but failed to get data.
URL: https://some-not-working-website.com
```
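The retry loop behind `attempts` can be pictured with a small stdlib-only sketch. This is an illustration of the idea, not resoup's actual implementation; `with_attempts` and `flaky` are hypothetical names:

```python
import logging

def with_attempts(func, attempts):
    """Retry `func` up to `attempts` times on ConnectionError."""
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except ConnectionError as error:
            logging.warning("Retrying...")
            last_error = error
    raise ConnectionError(
        f"Trying {attempts} times but failed to get data."
    ) from last_error

calls = {"n": 0}

def flaky():
    # Fails twice, then succeeds, standing in for an unstable server.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary network failure")
    return "ok"

result = with_attempts(flaky, attempts=5)
print(result)  # ok
```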
`avoid_sslerror` helps with sites that fail with UNSAFE_LEGACY_RENEGOTIATION_DISABLED. For example, the following site raises this error when `avoid_sslerror` is not used:
```python
>>> from resoup import requests
>>> requests.get('https://bufftoon.plaync.com')
---------------------------------------------------------------------------
SSLError                                  Traceback (most recent call last)
...
SSLError: HTTPSConnectionPool(host='bufftoon.plaync.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: UNSAFE_LEGACY_RENEGOTIATION_DISABLED] unsafe legacy renegotiation disabled (_ssl.c:1000)')))
```
Setting `avoid_sslerror` to `True` avoids the error:

```python
>>> requests.get('https://bufftoon.plaync.com', avoid_sslerror=True)
<Response [200]>
```
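For reference, the usual way to work around UNSAFE_LEGACY_RENEGOTIATION_DISABLED in plain Python is to enable legacy server connect on the SSL context; presumably `avoid_sslerror` does something along these lines internally (an assumption, not confirmed by this document):

```python
import ssl

# OP_LEGACY_SERVER_CONNECT permits connections to servers that lack
# RFC 5746 secure renegotiation, which is what triggers this SSLError.
# The named constant only exists on newer Python versions, so fall
# back to its numeric value 0x4 where it is missing.
context = ssl.create_default_context()
context.options |= getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)
```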
Plain requests.get/options/head/post/put/patch/delete can be used exactly as in the requests library. Below is an example of requests.get and post; it behaves exactly like the requests module.

```python
>>> from resoup import requests
>>>
>>> requests.get('https://jsonplaceholder.typicode.com/todos/1').json()  # API that can receive requests for testing. Don't execute this command unless you trust this API.
{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False}
>>> requests.post('https://jsonplaceholder.typicode.com/todos', json={
...     'title': 'foo',
...     'body': 'bar',
...     'userId': 1,
... }).json()
{'title': 'foo', 'body': 'bar', 'userId': 1, 'id': 201}  # same as the original requests library
```
These are identical to plain requests.get/.../delete requests, except that responses are cached. The cache is shared across calls to the same cached request function, but not between different methods. Prefix the method name with `c` to use them: requests.cget/coptions/chead/cpost/cput/cpatch/cdelete.

It is best to avoid these when you call a dynamic service that can return different results for the same URL (changes over time are not reflected) or when responses are large (memory can be wasted).
```python
>>> # results vary with hardware and connection quality
>>> import timeit
>>>
>>> timeit.timeit('requests.get("https://python.org")', number=10, setup='from resoup import requests')
1.1833231999917189  # all 10 calls send a real request
>>> timeit.timeit('requests.cget("https://python.org")', number=10, setup='from resoup import requests')
0.10267569999268744  # only the first call sends a request; the rest are served from the cache
```
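Conceptually, this per-function caching works like memoizing the request on its URL. A stdlib-only sketch of the idea (not resoup's actual code; `fake_cget` is a hypothetical stand-in for a network call):

```python
from functools import lru_cache

call_count = 0

@lru_cache(maxsize=None)
def fake_cget(url):
    """Hypothetical stand-in for a cached GET; a real implementation
    would perform the network request here."""
    global call_count
    call_count += 1
    return f"response for {url}"

fake_cget("https://example.com")
fake_cget("https://example.com")  # served from the cache, no second "request"
print(call_count)  # 1
```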
These send asynchronous requests. Prefix the method name with `a`: requests.aget/aoptions/ahead/apost/aput/apatch/adelete.

`run_in_executor` is enabled by default. See the 'Using run_in_executor' section below for details.
```python
>>> import asyncio
>>>
>>> from resoup import requests
>>>
>>> res = asyncio.run(requests.aget('https://python.org'))
>>> res
<Response [200]>
```
These are asynchronous, cached requests. The cache is shared across cached request functions of the same method. Prefix the method name with `ac`: requests.acget/acoptions/achead/acpost/acput/acpatch/acdelete.

As with the `c`-prefixed functions, it is best to avoid these for dynamic services whose responses change over time or for large responses that would waste memory.

`run_in_executor` is enabled by default. See the 'Using run_in_executor' section below for details.
```python
>>> import asyncio
>>> import timeit
>>>
>>> timeit.timeit('asyncio.run(requests.aget("https://python.org"))', number=10, setup='from resoup import requests; import asyncio')
0.8676127000362612  # results vary with hardware and connection quality: all 10 calls send a real request
>>> timeit.timeit('asyncio.run(requests.acget("https://python.org"))', number=10, setup='from resoup import requests; import asyncio')
0.11984489997848868  # only the first call sends a request; the rest are served from the cache
```
Asynchronous requests (the `a`-prefixed methods such as aget and acget) accept a `run_in_executor` parameter, which makes the function run in a separate thread. It makes little difference when the program runs sequentially, but you can expect a large speedup when requests run in parallel.

Using `asyncio.gather` as shown below yields a substantial performance gain.
```python
import asyncio
import time

from resoup import requests


async def measure_coroutine_time(coroutine):
    start = time.perf_counter()
    await coroutine
    end = time.perf_counter()
    print(end - start)


async def main():
    # When sending a single request (little difference):
    req = requests.aget('https://python.org', run_in_executor=False)
    await measure_coroutine_time(req)  # 0.07465070000034757

    req = requests.aget('https://python.org')
    await measure_coroutine_time(req)  # 0.05844969999452587

    # When sending many requests (large speedup):
    reqs = (requests.aget(f'https://python.org/{i}', run_in_executor=False) for i in range(10))  # dummy URLs
    await measure_coroutine_time(asyncio.gather(*reqs))  # without run_in_executor: slow (3.7874760999984574)

    reqs = (requests.aget(f'https://python.org/{i}') for i in range(10))  # dummy URLs
    await measure_coroutine_time(asyncio.gather(*reqs))  # with run_in_executor (the default): fast (0.11582900000212248)


if __name__ == '__main__':
    asyncio.run(main())
```
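What `run_in_executor` does can be pictured with plain asyncio: a blocking call is handed to a thread so that multiple calls overlap instead of running one after another. This is a stdlib analogy, not resoup's internals; `blocking_fetch` is a hypothetical stand-in:

```python
import asyncio
import time

def blocking_fetch(i):
    # Stands in for a blocking HTTP request such as requests.get().
    time.sleep(0.05)
    return i

async def main():
    # Each blocking call runs in its own worker thread, so all ten
    # overlap instead of running sequentially.
    tasks = [asyncio.to_thread(blocking_fetch, i) for i in range(10)]
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
print(results)  # [0, 1, ..., 9] in roughly the time of a single call
```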
This module is compatible with the requests library in almost every respect, but there are a few incompatibilities. For bug-prevention or technical reasons, some dunder attributes are missing or behave differently.
Dunder attributes that are unavailable or do not match the requests library: `__builtins__`, `__cached__`, `__doc__`, `__file__`, `__loader__`, `__name__`, `__package__`, `__spec__`

Dunder attributes that are available and match the requests library: `__author__`, `__author_email__`, `__build__`, `__cake__`, `__copyright__`, `__description__`, `__license__`, `__title__`, `__url__`, `__version__`
```python
>>> import requests
>>> requests.__name__
'requests'
>>> requests.__path__
['some path']
>>> requests.__cake__
'✨ 🍰 ✨'
>>>
>>> from resoup import requests
>>> requests.__name__  # incompatible dunder attribute
'resoup.requests_proxy'  # differs from requests
>>> requests.__path__  # unavailable and incompatible dunder attribute
AttributeError: module 'resoup.requests_' has no attribute '__path__'
>>> requests.__cake__  # compatible dunder attribute
'✨ 🍰 ✨'
```
`resoup.requests` preserves import compatibility in almost all cases, but a few rules apply.

`resoup.requests` can only be used in the exact form `from resoup import requests`.

```python
# In each pair below, the first line imports requests and the second imports resoup.requests.

# importing the requests module itself
import requests
from resoup import requests  # works
```
Consequently, the following kinds of import are impossible with `resoup.requests`:

```python
# importing a submodule of requests
import requests.models  # works
import resoup.requests.models  # impossible!

# importing a submodule (w/ from .. import ...)
from requests import models  # works
from resoup.requests import models  # impossible!

# importing a member of a submodule of requests
from requests.models import Response  # works
from resoup.requests.models import Response  # impossible!
```
In these cases, importing the module itself solves the problem. For example, suppose you have the following code:

```python
from requests.models import Response  # imports a member of a submodule

def is_response(instance):
    return isinstance(instance, Response)
```
This code can be fixed in either of the following ways:

```python
# Switch to requests.models.Response.
# Advantage: clean and not error-prone.
from resoup import requests  # import the module itself

def is_response(instance):
    return isinstance(instance, requests.models.Response)  # changed to requests.models.Response
```

```python
# Define Response yourself.
# Advantage: no need to modify the rest of the code.
from resoup import requests

Response = requests.models.Response

def is_response(instance):
    return isinstance(instance, Response)
```
Use whichever style you prefer.
ResponseProxy is the return type of requests.get/options/head/post/put/patch/delete in this library. It is 100% compatible with the original Response while providing six additional methods. In this section, the details are explained in the comments of the example.
```python
>>> # Rename the modules in order to use both at the same time.
>>> import requests as original_requests
>>> from resoup import requests as resoup_requests
>>>
>>> # The requests module returns a Response.
>>> response1 = original_requests.get("https://peps.python.org/pep-0020/")  # a static website
>>> print(response1)
<Response [200]>
>>> print(type(response1))  # Response object
<class 'requests.models.Response'>
>>> # The resoup.requests module returns a ResponseProxy.
>>> response2 = resoup_requests.get("https://peps.python.org/pep-0020/")
>>> print(response2)
<Response [200]>
>>> print(type(response2))  # ResponseProxy object
<class 'resoup.response_proxy.ResponseProxy'>
>>>
>>> # All of the following checks pass.
>>> assert response1.text == response2.text
>>> assert response1.status_code == response2.status_code
>>> assert response1.url == response2.url
>>> assert response1.content == response2.content
>>>
>>> # But ResponseProxy has these additional features.
>>> print(response2.soup())
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
...
<script src="../_static/wrap_tables.js"></script>
<script src="../_static/sticky_banner.js"></script>
</body>
</html>
>>> print(response2.soup_select('title'))
[<title>PEP 20 – The Zen of Python | peps.python.org</title>, <title>Following system colour scheme</title>, <title>Selected dark colour scheme</title>, <title>Selected light colour scheme</title>]
>>> print(response2.soup_select_one('p', no_empty_result=True).text)
Long time Pythoneer Tim Peters succinctly channels the BDFL's guiding
principles for Python's design into 20 aphorisms, only 19 of which
have been written down.
>>>
>>> from requests.models import Response
>>> # ResponseProxy is a subclass of Response,
>>> # so it passes isinstance checks.
>>> isinstance(response2, Response)
True
>>> # Being a subclass, it does not pass an exact type equality check.
>>> type(response1) == type(response2)
False
```
ResponseProxy provides several methods, grouped into two families:

- soup family: `.soup()`, `.soup_select()`, `.soup_select_one()`. These are the basic functions.
- xml family: `.xml()`, `.xml_select()`, `.xml_select_one()`. These are the soup-family functions with the parser set to 'xml'.

Each family has three functions, which work as follows:

- `.soup()`/`.xml()`: returns the content parsed with BeautifulSoup.
- `.soup_select()`/`.xml_select()`: similar to `.soup().select()`.
- `.soup_select_one()`/`.xml_select_one()`: similar to `.soup().select_one()`.

See below for details.
`.soup()` takes text or a response and returns a `BeautifulSoup` object. Both a response and response.text are accepted as the argument, but passing the response is recommended, because you then get more detailed error messages.
```python
>>> from resoup import requests
>>>
>>> response = requests.get("https://python.org")
>>> response.soup()  # every parameter usable with BeautifulSoup can be used
<!DOCTYPE html>
...
</body>
</html>
```
This function is effectively the same as passing the response through `BeautifulSoup`. The code below is nearly identical to the code above.
```python
>>> import requests
>>> from bs4 import BeautifulSoup
>>>
>>> response = requests.get("https://python.org")
>>> BeautifulSoup(response.text)
<!DOCTYPE html>
...
</body>
</html>
```
When no parser is available, `BeautifulSoup` raises a `FeatureNotFound` error, while `.soup()` raises a `NoParserError`.
`.soup_select()` takes text or a response and returns BeautifulSoup Tags. The `selector` parameter takes a CSS selector.
```python
>>> from resoup import requests
>>>
>>> response = requests.get("https://python.org")
>>> response.soup_select("p")
[<p><strong>Notice:</strong> While JavaScript is not essential for this website
...]
```
The code below works similarly to the code above.
```python
>>> import requests
>>> from bs4 import BeautifulSoup
>>>
>>> response = requests.get('https://python.org')
>>> soup = BeautifulSoup(response.text).select('p')
>>> soup
[<p><strong>Notice:</strong> While JavaScript is not essential for this website
...]
```
A distinctive feature of this function is the `no_empty_result` parameter. When it is True, an `EmptyResultError` is raised if the result of .select() is an empty list.
```python
>>> from resoup import requests
>>>
>>> response = requests.get("https://python.org")
>>> response.soup_select("data-some-complex-and-error-prone-selector")
[]
>>>
>>> response = requests.get("https://python.org")
>>> response.soup_select(
...     "data-some-complex-and-error-prone-selector",
...     no_empty_result=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...souptools.py", line 148, in soup_select
    raise EmptyResultError(
resoup.exceptions.EmptyResultError: Result of select is empty list("[]"). This error happens probably because of invalid selector or URL. Check if both selector and URL are valid. Set to False `no_empty_result` if empty list is intended. It may also because of selector is not matched with URL.
selector: data-some-complex-and-error-prone-selector, URL: https://www.python.org/
```
This function returns a BroadcastList by default. See the BroadcastList section below for details.
`.soup_select_one()` takes text or a response and returns a BeautifulSoup Tag. The `selector` parameter takes a CSS selector.
```python
>>> from resoup import requests
>>>
>>> response = requests.get('https://python.org')
>>> response.soup_select_one('p strong', no_empty_result=True)
<strong>Notice:</strong>
```
The code below works similarly to the code above.
```python
>>> import requests
>>> from bs4 import BeautifulSoup
>>>
>>> response = requests.get('https://python.org')
>>> soup = BeautifulSoup(response.text, 'html.parser').select_one('p strong')
>>> if soup is None:  # check corresponding to no_empty_result
...     raise Exception
...
>>> soup
<strong>Notice:</strong>
```
When the `no_empty_result` parameter is True, an `EmptyResultError` is raised if the result of .select_one() is None.

This feature is useful for type checking and also helps make errors clearer.

In stock BeautifulSoup, the return type of `.select_one()` is annotated as `Tag | None`, so code like `.select_one().text` triggers errors from static type checkers. Worse, when `.select_one()` actually returns None, the resulting `'NoneType' object has no attribute 'text'` message makes it hard to see where the error came from.

`no_empty_result` solves both problems. Setting `no_empty_result` to True keeps type checkers quiet, and if the result does turn out to be None, it produces a far more detailed error message that includes suggested fixes.
```python
>>> from resoup import requests
>>>
>>> response = requests.get("https://python.org")
>>> print(response.soup_select_one("data-some-complex-and-error-prone-selector"))
None  # without print(), nothing would be shown; the call just quietly returns None
>>>
>>> response = requests.get("https://python.org")
>>> response.soup_select_one(
...     "data-some-complex-and-error-prone-selector",
...     no_empty_result=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...souptools.py", line 220, in soup_select_one
    raise EmptyResultError(
resoup.exceptions.EmptyResultError: Result of select_one is None. This error happens probably because of invalid selector or URL. Check if both selector and URL are valid. Set to False `no_empty_result` if empty list is intended. It may also because of selector is not matched with URL.
selector: data-some-complex-and-error-prone-selector, URL: https://www.python.org/
```
Replacing `soup` with `xml` in the soup-related functions of `ResponseProxy` gives the xml functions. These are identical to the soup functions except that the parser is `'xml'`.

An example:
```python
>>> from resoup import requests
>>>
>>> response = requests.get('https://www.w3schools.com/xml/plant_catalog.xml')
>>> selected = response.xml_select('LIGHT', no_empty_result=True)
>>> selected
[<LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Sunny</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Sunny</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Sunny</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Sunny</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Sun</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>]
```
The code above is nearly identical to the code below.
```python
>>> from resoup import requests
>>> from functools import partial
>>>
>>> response = requests.get('https://www.w3schools.com/xml/plant_catalog.xml')
>>> # corresponds to `.xml_select()`
>>> xml_select_partial = partial(response.soup_select, parser='xml')
>>> selected = xml_select_partial('LIGHT', no_empty_result=True)
>>> selected
[<LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Sunny</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Sunny</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Sunny</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Sunny</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Sun or Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Sun</LIGHT>, <LIGHT>Mostly Shady</LIGHT>, <LIGHT>Shade</LIGHT>, <LIGHT>Shade</LIGHT>]
```
`.soup_select()` and `.xml_select()` return a list. This makes it hard to use attributes such as `.text` that you could use on the results of `.soup()` or `.soup_select_one()`.

You can solve this with a for loop or a list comprehension:
```python
>>> from resoup import requests
>>> tags_list = requests.get("https://python.org").soup_select("p strong")
>>> [element.text for element in tags_list]
['Notice:', 'relaunched community-run job board']
```
But you may not always like that approach. Especially during development, you may want a faster way to apply `.text` and the like than writing a for loop or list comprehension.

BroadcastList, the default return type of `.soup_select()` in this project, is the workaround for this. With BroadcastList, attributes normally used on a single Tag can be applied directly to the list.
```python
>>> from resoup import requests
>>> tags_list = requests.get("https://python.org").soup_select("p strong")
>>> tags_list
[<strong>Notice:</strong>, <strong>relaunched community-run job board</strong>]
>>> type(tags_list)
<class 'resoup.broadcast_list.TagBroadcastList'>  # BroadcastList is used
>>> tags_list.text  # broadcasting
['Notice:', 'relaunched community-run job board']
>>>
>>> tags_list_with_no_broadcast_list = requests.get('https://python.org').soup_select('p', use_broadcast_list=False)
>>> type(tags_list_with_no_broadcast_list)
<class 'bs4.element.ResultSet'>  # BroadcastList is not used
>>> tags_list_with_no_broadcast_list.text
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...element.py", line 2428, in __getattr__
    raise AttributeError(
AttributeError: ResultSet object has no attribute 'text'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
```
BroadcastList can be turned off as follows:
```python
>>> from resoup import requests
>>>
>>> tags_list = requests.get("https://python.org").soup_select("p", use_broadcast_list=False)
>>> type(tags_list)
bs4.element.ResultSet
>>> tags_list.text  # broadcasting is not applied
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...element.py", line 2428, in __getattr__
    raise AttributeError(
AttributeError: ResultSet object has no attribute 'text'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
```
BroadcastList also has the following special behavior: when getitem is called with an integer or a slice, it behaves like a normal list.
```python
>>> from resoup import requests
>>> # fetch the page (cached)
>>> tag_broadcast_list = requests.cget("https://www.python.org/community/logos/").soup_select("img")
>>> tag_broadcast_list
[<img alt="Python Software Foundation" class="psf-logo" src="/static/img/psf-logo.png"/>,
...
<img alt="Logo device only" src="https://s3.dualstack.us-east-2.amazonaws.com/pythondotorg-assets/media/community/logos/python-logo-only.png" style="height: 48px;"/>,
<img alt="/static/community_logos/python-powered-w-100x40.png" src="/static/community_logos/python-powered-w-100x40.png"/>,
<img alt="/static/community_logos/python-powered-h-50x65.png" src="/static/community_logos/python-powered-h-50x65.png"/>]
>>> # integer getitem
>>> tag_broadcast_list[0]
<img alt="Python Software Foundation" class="psf-logo" src="/static/img/psf-logo.png"/>
>>> # slicing
>>> tag_broadcast_list[3:5]
[<img alt="/static/community_logos/python-powered-w-100x40.png" src="/static/community_logos/python-powered-w-100x40.png"/>,
<img alt="/static/community_logos/python-powered-h-50x65.png" src="/static/community_logos/python-powered-h-50x65.png"/>]
>>> # string getitem (broadcasting is applied!)
>>> tag_broadcast_list["alt"]
['Python Software Foundation',
 'Combined logo',
 'Logo device only',
 '/static/community_logos/python-powered-w-100x40.png',
 '/static/community_logos/python-powered-h-50x65.png']
```
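The broadcasting behavior can be sketched in a few lines: attribute access that a plain list does not understand is forwarded to each element. This is a simplified illustration, not resoup's actual TagBroadcastList; `BroadcastListSketch` and `FakeTag` are hypothetical names:

```python
class BroadcastListSketch(list):
    """Forward unknown attribute lookups to every element."""
    def __getattr__(self, name):
        # Only called when the list itself lacks the attribute,
        # so normal list behavior (indexing, slicing) is untouched.
        return [getattr(item, name) for item in self]

class FakeTag:
    """Hypothetical stand-in for a bs4 Tag."""
    def __init__(self, text):
        self.text = text

tags = BroadcastListSketch([FakeTag("Notice:"), FakeTag("job board")])
print(tags.text)     # ['Notice:', 'job board']
print(tags[0].text)  # Notice:  (integer getitem still works like a list)
```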
`CustomDefaults` lets you set your own defaults. It effectively sets defaults for the plain get/options/head/post/put/patch/delete functions as well as for the c../a../ac.. variants.
```python
>>> from resoup import CustomDefaults
>>>
>>> requests = CustomDefaults(headers={'User-Agent': 'User Agent for Test'})
>>> requests.get('https://httpbin.org/headers').json()['headers']['User-Agent']
'User Agent for Test'
```
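The mechanism is essentially keyword-argument merging: defaults stored on the object are overridden by per-call arguments. A hedged sketch of the idea; `CustomDefaultsSketch` is a hypothetical name, not resoup's class:

```python
class CustomDefaultsSketch:
    """Store default keyword arguments and merge per-call overrides."""
    def __init__(self, **defaults):
        self.defaults = defaults

    def build_kwargs(self, **overrides):
        # Later keys win, so call-site arguments override the defaults.
        return {**self.defaults, **overrides}

requests_like = CustomDefaultsSketch(
    headers={"User-Agent": "User Agent for Test"},
    timeout=120,
)
merged = requests_like.build_kwargs(timeout=10)
print(merged)  # {'headers': {'User-Agent': 'User Agent for Test'}, 'timeout': 10}
```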
This program is released under the MIT license.

Parts of this program contain code from the requests library (Apache License 2.0).
Parts of this program contain code from typeshed (Apache License 2.0 or MIT License).
0.5.2 (2023-12-26): made Timeout errors also subject to attempts, added variables usable from the package root, improved build code, code cleanups
0.5.1 (2023-12-09): bug fixes
0.5.0 (2023-12-09): renamed to resoup, applied the new BroadcastList by default, switched to poetry, removed the old souptools module and replaced it with the souptoolsclass module, added tests
0.4.1 (2023-11-04): urgent bug fix
0.4.0 (2023-11-04): changed the raise_for_status default, added souptoolsclass, added avoid_sslerror
0.3.0 (2023-10-05): restored BroadcastList, added sessions_with_tools
0.2.3 (2023-09-19): changed the default headers, show only the last error on ConnectionError, added a message when a retry via attempts succeeds, removed the URL from retry logs, changed setup.py and related files
0.2.2 (2023-09-08): renamed the attempt parameter to attempts, removed BroadcastList
0.2.1 (2023-08-31): added py.typed, added freeze_dict_and_list
0.2.0 (2023-08-27): added CustomDefaults
0.1.1 (2023-08-27): first release