Scraping Daum Manhwa

Author: FlOw / Posted: Fri, 2005/08/05 - 8:15 PM

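This script scrapes the URLs of the manhwa "The Great Catsby" (위대한 캣츠비) from Daum Manhwa.
When it finishes, the result is saved to a file named catsbe.html.
I wrote it while studying Python regular expressions, so it is rough and takes a long time to run. :oops: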

#
# Daum Manhwa : Cats be
# (Python 2 script)
#

import urllib
import re

urlBase = "http://cartoon.media.daum.net"
urlBase2 = "http://cartoon.media.daum.net/daumtoon/catsbe/list"
pageList = []     # list (index) pages
urlList = []      # episode page URLs
imageList = []    # image URLs found on the episode pages
subjectList = []  # episode titles

def findImage(url):
    # Collect the image URLs found on one episode page.
    print "Image Url Parsing...",
    fp = urllib.urlopen(url)
    for line in fp.readlines():
        m = re.search(r"http://[\w.-]+/daum/(cartoon|news)(/\d{6}/\d{2}/\w+\.jpg)", line)
        if m is not None:
            print m.group(2)
            imageList.append(m.group())
    fp.close()

def findUrl(url):
    # Collect episode page URLs and episode titles from one list page.
    print "Page Url Parsing..."
    fp = urllib.urlopen(url)
    for line in fp.readlines():
        m = re.search(r"/uccmix/daumtoon/catsbe(/\d{6}/\d{2}/cartoon/v\d{7}\.html)", line)
        if m is not None:
            print m.group(1),
            urlList.append(urlBase + m.group())
        m = re.search(r'class="gv_\d{2}_\d{6}">([^<]+)', line)
        if m is not None:
            print m.group(1).rstrip()
            subjectList.append(m.group(1).rstrip())
    fp.close()

def findPage(url):
    # Collect the list pages (index-N.html) linked from the first list page.
    print "Page List Parsing..."
    pageList.append(url)
    fp = urllib.urlopen(url)
    for line in fp.readlines():
        # The dot is escaped so that it only matches a literal "."
        m = re.search(r"/index-\d+\.html", line)
        if m is not None:
            print m.group()
            pageList.append(urlBase2 + m.group())
    fp.close()

if __name__ == "__main__":
    findPage("http://cartoon.media.daum.net/daumtoon/catsbe/list/index.html")
    for page in pageList:
        findUrl(page)
    for page in urlList:
        findImage(page)
    # Reverse both lists so that numbering in the output starts from the other end
    imageList.reverse()
    subjectList.reverse()
    filename = "catsbe.html"
    no = 0
    fp = open(filename, "w")
    fp.write("<html>\n<body>\n<a href='"+urlBase2+"/index.html'><img src='http://img-media.hanmail.net/15/menu/cartoon/catsbe.gif'  border='0'></a><br/>\n")
    # Assumes one image and one title per episode, so both lists have the same length
    for data in imageList:
        no += 1
        fp.write(str(no)+". <a href='"+data+"'>"+subjectList[no-1]+"</a><br/>\n")
    fp.write("</body>\n</html>\n")
    fp.close()
    print "Saved ./" + filename, "..."
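
The regex matching in findUrl can be tried offline against a small HTML fragment. The following is a minimal Python 3 sketch, not the script above: the markup in SAMPLE_HTML and the helper name extract_episodes are made up for illustration, and the real Daum pages may have looked different. It reuses the two patterns from findUrl, compiled once and applied to the whole text instead of line by line.

```python
import re

# Patterns taken from findUrl, written as raw strings and compiled once.
EPISODE_RE = re.compile(r"/uccmix/daumtoon/catsbe(/\d{6}/\d{2}/cartoon/v\d{7}\.html)")
SUBJECT_RE = re.compile(r'class="gv_\d{2}_\d{6}">([^<]+)')

URL_BASE = "http://cartoon.media.daum.net"

# Hypothetical markup, invented for this example only.
SAMPLE_HTML = """\
<a href="/uccmix/daumtoon/catsbe/200508/05/cartoon/v1234567.html"
   class="gv_01_000001">Episode 1 </a>
<a href="/uccmix/daumtoon/catsbe/200508/08/cartoon/v1234568.html"
   class="gv_01_000002">Episode 2 </a>
"""


def extract_episodes(html):
    """Return (url, title) pairs found in html, in document order."""
    urls = [URL_BASE + m.group() for m in EPISODE_RE.finditer(html)]
    titles = [m.group(1).rstrip() for m in SUBJECT_RE.finditer(html)]
    # zip() stops at the shorter list, so a missing title cannot raise IndexError.
    return list(zip(urls, titles))


if __name__ == "__main__":
    for number, (url, title) in enumerate(extract_episodes(SAMPLE_HTML), start=1):
        print("%d. %s -> %s" % (number, title, url))
```

Pairing URLs and titles with zip() is one possible design choice here; it silently drops unmatched entries, which may or may not be what you want when a page layout changes.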

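What is this

Author: Anonymous user / Posted: Mon, 2005/08/08 - 9:53 PM

If you're going to make something, put some effort into it.
What is this?
Don't even say you're studying programming.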

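우하하 -> Then why don't you make one and publish it yourself?

Author: hiseob / Posted: Mon, 2005/08/08 - 9:55 PM

우하하 -> Then why don't you make one and publish it yourself?
The real idiot is someone like you, a person who tears down what others have made and shared...
If you gave it even a hundredth of the time the author put in, you'd realize how petty you are.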

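Quoting 우하하: If you're going to make something, put some effort into it

Author: june / Posted: Mon, 2005/08/08 - 10:08 PM

Quote:

우하하
If you're going to make something, put some effort into it.
What is this?
Don't even say you're studying programming.

Doesn't something need to be done about posting as a guest?

Coffee: black, or with sugar only..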

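Re: What is this

Author: FlOw / Posted: Mon, 2005/08/08 - 10:12 PM

우하하 wrote:

If you're going to make something, put some effort into it.
What is this?
Don't even say you're studying programming.

The URLs on Daum Manhwa changed a few days ago.
Instead of a blanket attack like that, tell me first what exactly lacks effort. :evil:

-------------------- cut here --
Be happy :)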

BeautifulSoup

Author: atie / Posted: Tue, 2005/08/09 - 5:34 PM

If you're using Python, I strongly recommend BeautifulSoup. I tried it out on an example of my own; I hope it is useful as a reference. :oops:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ythumbs.py (Python 2)

import urllib
from BeautifulSoup import BeautifulSoup

# define variables
thumbs = []

def get_thumbs():
    URL = "http://kr.image.search.yahoo.com/search/images?b="
    NAME = "&p=%B1%E8%C5%C2%C8%F1"  # percent-encoded search query
    TYPE = "&subtype=com&z=imgbox"
    # n takes the values 1, 21, ..., 201 and is passed as the b= parameter
    for n in range(1, 202, 20):
        stream = urllib.urlopen(URL + str(n) + NAME + TYPE)
        soup = BeautifulSoup(stream)
        for link in soup('img'):
            thumb = link.get('src', '')
            # keep only the images whose URL contains "thumb"
            if "thumb" in thumb:
                thumbs.append(thumb)
                print thumb

def save_to_file():
    filename = "ythumbs.html"  # renamed from "file", which shadows a builtin
    fp = open(filename, "w")
    fp.write("<html>\n<body>\n<br/>\n<table>\n<tr>\n")
    i = 1
    for thumb in thumbs:
        if i < 9:
            fp.write("<td><img src='"+thumb+"'></td>\n")
            i = i + 1
        else:
            # Start a new row. This cell is already the first one in it,
            # so the counter restarts at 2 to keep every row at 8 cells.
            fp.write("</tr>\n<tr><td><img src='"+thumb+"'></td>\n")
            i = 2
    fp.write("</tr>\n</table></body>\n</html>\n")
    fp.close()
    print "Saved ./" + filename, "..."

if __name__ == '__main__':
    get_thumbs()
    save_to_file()
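
For readers without BeautifulSoup installed, the same filtering idea (keep only the img tags whose src contains "thumb") can be sketched with the html.parser module from the Python 3 standard library. This swaps in a different parser and is not the code above; the class name ThumbCollector, the helper rows_of, and the sample markup are all hypothetical.

```python
from html.parser import HTMLParser


class ThumbCollector(HTMLParser):
    """Collect src values of <img> tags whose URL contains 'thumb'."""

    def __init__(self):
        super().__init__()
        self.thumbs = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # attrs is a list of (name, value) pairs; value can be None.
        src = dict(attrs).get("src") or ""
        if "thumb" in src:
            self.thumbs.append(src)


def rows_of(items, size):
    """Split items into consecutive rows of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical markup, invented for this example only.
SAMPLE_HTML = """
<img src="http://example.com/thumb/a.jpg">
<img src="http://example.com/logo.gif">
<img src="http://example.com/thumb/b.jpg">
<img alt="no src">
"""

if __name__ == "__main__":
    parser = ThumbCollector()
    parser.feed(SAMPLE_HTML)
    print(parser.thumbs)
    print(rows_of(parser.thumbs, 8))
```

Splitting the list into rows first, as rows_of does, is one way to avoid keeping a manual cell counter while writing the table.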

----
I paint objects as I think them, not as I see them.
atie's minipage

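Re: What is this

Author: jachin / Posted: Tue, 2005/08/09 - 8:05 PM

우하하 wrote:

If you're going to make something, put some effort into it.
What is this?
Don't even say you're studying programming.

Your comment shows even less effort. -_-;

It seems people these days don't really know what "effort" means. Saying that source code someone worked hard on and put real care into "lacks effort"? Aren't you mistaken about something? :evil: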
