http requests

The task here is to fetch the page behind an HTTP URL and parse it.
1) Parsing module: BeautifulSoup : http://coreapython.hosting.paran.com/etc/beautifulsoup4.html
feedparser is another option.
2) HTTP request module: requests : http://docs.python-requests.org/en/latest/
urllib can be used as well : http://coreapython.hosting.paran.com/etc/Python%20and%20HTML%20Processing.htm
3) For cases where an HTTP request is not possible, save the HTML file locally; a function that reads the local file is also added so that BeautifulSoup can still be tested.
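As a rough illustration of the parsing step in 1), here is a minimal sketch of what BeautifulSoup does with a page once it has been downloaded. It assumes Python 3 with the bs4 package installed; the sample markup and the extract_links helper are made up for this example and are not part of the original code.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, standing in for a downloaded page.
SAMPLE_HTML = """
<html>
  <head><title>Sample news</title></head>
  <body>
    <ul>
      <li><a href="/news/1">First headline</a></li>
      <li><a href="/news/2">Second headline</a></li>
    </ul>
  </body>
</html>
"""


def extract_links(html):
    """Return (text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a", href=True)]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
page_title = soup.title.string
links = extract_links(SAMPLE_HTML)

print(page_title)
for text, href in links:
    print(text, href)
```

The same call works on the bytes returned by requests or on an open file object, which is why the local-file fallback in 3) can feed BeautifulSoup directly.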
(The DHome class inherits from a class called Newsflash, but let's just skip over that here.)
# coding: utf-8
import unittest
from bs4 import BeautifulSoup
from pprint import pprint
import urllib
import requests
from contextlib import closing

# .. omitted ..

class DHome(Newsflash):

    def _html_doc_file(self, filename):
        # When an HTTP request is not possible, read a saved copy from disk
        return open('./datafile/' + filename)

    def url_requests(self, url):
        # 1) Using the Requests library
        response = requests.get(url)
        return response.content

        # 2) Using the urllib library
        #    (Python 2 API; in Python 3 this is urllib.request.urlopen)
        #f = urllib.urlopen(url)
        #s = f.read()
        #return s

    def get_reference_news(self):
        # When an HTTP request is not possible, read the local file instead
        #response_content = self._html_doc_file('daum_h.html')
        response_content = self.url_requests("http://m.daum.net/")
        pprint(response_content)
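In the code above, switching between the network and the saved file is done by commenting lines in and out. One possible alternative, sketched below under the assumption of Python 3 and using only the standard library, is to try the request first and fall back to the local file automatically. The names fetch_with_urllib, load_html and always_offline are hypothetical helpers introduced for this sketch, not part of DHome.

```python
import os
import tempfile
import urllib.request


def fetch_with_urllib(url, timeout=5):
    """Download the raw bytes of a page using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def load_html(url, fallback_path, fetch=fetch_with_urllib):
    """Try the network first; on an OSError read the saved copy instead."""
    try:
        return fetch(url)
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses.
        with open(fallback_path, "rb") as saved:
            return saved.read()


def always_offline(url):
    """Stand-in fetcher that simulates a machine without network access."""
    raise OSError("network unreachable")


# Demo: write a hypothetical saved page, then load it through the fallback path.
with tempfile.TemporaryDirectory() as workdir:
    saved_page = os.path.join(workdir, "daum_h.html")
    with open(saved_page, "wb") as handle:
        handle.write(b"<html><title>saved copy</title></html>")
    html = load_html("http://m.daum.net/", saved_page, fetch=always_offline)

print(html)
```

Passing the fetcher in as an argument keeps the fallback testable without touching the network, which is the same goal the local-file function serves in the original class.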
When using requests, a request can be customized in many ways, for example by choosing the HTTP method and setting headers.
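To make that concrete, here is a small sketch that builds a GET and a POST request with custom headers. It uses requests.Request(...).prepare() so the requests can be inspected without actually being sent; the header values, the query parameter and the form field are placeholders invented for this example.

```python
import requests

# Hypothetical header and form values, chosen only for illustration.
headers = {"User-Agent": "my-crawler/0.1", "Accept-Language": "ko"}

# A GET request with custom headers and a query string.
get_request = requests.Request(
    "GET", "http://m.daum.net/", headers=headers, params={"q": "python"}
).prepare()

# A POST request carrying form-encoded data.
post_request = requests.Request(
    "POST", "http://m.daum.net/", headers=headers, data={"key": "value"}
).prepare()

print(get_request.method, get_request.url)
print(post_request.method, post_request.body)
print(post_request.headers["Content-Type"])

# To actually send a prepared request (needs network access):
# with requests.Session() as session:
#     response = session.send(get_request, timeout=5)
#     print(response.status_code)
```

For everyday use the shortcuts requests.get(url, headers=..., params=...) and requests.post(url, headers=..., data=...) accept the same arguments and send the request in one step.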
