Ralsina.Me: Roberto Alsina's website

Capturing a webpage as an image using Python and Qt

So I googled and found CutyCapt, which uses Qt and WebKit to convert web pages into images. That works for me!

Since I want to use it from a PyQt application, it makes sense to do the same thing CutyCapt does, but from a Python module. So here is a quick implementation that works for me, although it lacks many of CutyCapt's features.

With a little more effort, it could save as PDF or SVG, which would let you use the output almost like a real web page.
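
As a rough idea of what that extra effort might look like, here is a minimal sketch that picks the output path from the file extension. It is not part of capty.py: `output_kind` and `save_vector` are hypothetical helpers, and the Qt side assumes PyQt4 with the QtSvg module available, following the same approach CutyCapt takes (a `QPrinter` for PDF, a `QSvgGenerator` for SVG).

```python
import os


def output_kind(filename):
    """Classify the output file by its extension: 'pdf', 'svg' or 'image'."""
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".pdf":
        return "pdf"
    if ext == ".svg":
        return "svg"
    return "image"


def save_vector(frame, filename):
    """Hypothetical helper: write a loaded QWebFrame as PDF or SVG.

    Returns True if the file was handled here, False if the caller
    should fall back to the QImage path used in doCapture.
    """
    # Imported lazily so output_kind() works without PyQt4 installed.
    from PyQt4 import QtGui, QtSvg

    kind = output_kind(filename)
    if kind == "pdf":
        printer = QtGui.QPrinter()
        printer.setOutputFormat(QtGui.QPrinter.PdfFormat)
        printer.setOutputFileName(filename)
        frame.print_(printer)  # PyQt4 spells print() as print_()
        return True
    if kind == "svg":
        generator = QtSvg.QSvgGenerator()
        generator.setFileName(filename)
        generator.setSize(frame.contentsSize())
        painter = QtGui.QPainter(generator)
        frame.render(painter)
        painter.end()
        return True
    return False
```

Under these assumptions, doCapture could call `save_vector(self.wb.mainFrame(), self.filename)` first and only build the `QImage` when it returns False.
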

You run the script like this:

python capty.py http://www.kde.org kde.png

And here is the code [download capty.py]

# -*- coding: utf-8 -*-

"""This tries to do more or less the same thing as CutyCapt, but as a
python module.

This is a derived work from CutyCapt: http://cutycapt.sourceforge.net/

////////////////////////////////////////////////////////////////////
//
// CutyCapt - A Qt WebKit Web Page Rendering Capture Utility
//
// Copyright (C) 2003-2010 Bjoern Hoehrmann <bjoern@hoehrmann.de>
//
// This program is free software; you can redistribute it and/or
// modify it under the terms of the GNU General Public License
// as published by the Free Software Foundation; either version 2
// of the License, or (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// GNU General Public License for more details.
//
// $Id$
//
////////////////////////////////////////////////////////////////////

"""

import sys
from PyQt4 import QtCore, QtGui, QtWebKit


class Capturer(object):
    """A class to capture webpages as images"""

    def __init__(self, url, filename):
        self.url = url
        self.filename = filename
        self.saw_initial_layout = False
        self.saw_document_complete = False

    def loadFinishedSlot(self):
        self.saw_document_complete = True
        if self.saw_initial_layout and self.saw_document_complete:
            self.doCapture()

    def initialLayoutSlot(self):
        self.saw_initial_layout = True
        if self.saw_initial_layout and self.saw_document_complete:
            self.doCapture()

    def capture(self):
        """Captures url as an image to the file specified"""
        self.wb = QtWebKit.QWebPage()
        self.wb.mainFrame().setScrollBarPolicy(
            QtCore.Qt.Horizontal, QtCore.Qt.ScrollBarAlwaysOff)
        self.wb.mainFrame().setScrollBarPolicy(
            QtCore.Qt.Vertical, QtCore.Qt.ScrollBarAlwaysOff)

        self.wb.loadFinished.connect(self.loadFinishedSlot)
        self.wb.mainFrame().initialLayoutCompleted.connect(
            self.initialLayoutSlot)

        self.wb.mainFrame().load(QtCore.QUrl(self.url))

    def doCapture(self):
        self.wb.setViewportSize(self.wb.mainFrame().contentsSize())
        img = QtGui.QImage(self.wb.viewportSize(), QtGui.QImage.Format_ARGB32)
        painter = QtGui.QPainter(img)
        self.wb.mainFrame().render(painter)
        painter.end()
        img.save(self.filename)
        QtCore.QCoreApplication.instance().quit()

if __name__ == "__main__":
    # Run a simple capture: python capty.py URL OUTPUT_FILE
    app = QtGui.QApplication(sys.argv)
    c = Capturer(sys.argv[1], sys.argv[2])
    c.capture()
    app.exec_()
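
The two slots in Capturer amount to a small "wait until both signals have arrived" gate. The sketch below shows that logic on its own, without Qt. `CaptureGate` and its method names are hypothetical, and the `fired` flag is an addition that capty.py does not have: it only guards against a second capture in case either signal were emitted more than once.

```python
class CaptureGate(object):
    """Calls on_ready exactly once, after both events have been seen."""

    def __init__(self, on_ready):
        self.on_ready = on_ready
        self.saw_initial_layout = False
        self.saw_document_complete = False
        self.fired = False  # not in capty.py; prevents a second capture

    def _maybe_fire(self):
        if self.fired:
            return
        if self.saw_initial_layout and self.saw_document_complete:
            self.fired = True
            self.on_ready()

    def initial_layout(self):
        """Stand-in for initialLayoutSlot."""
        self.saw_initial_layout = True
        self._maybe_fire()

    def load_finished(self, ok=True):
        """Stand-in for loadFinishedSlot; ok mirrors the signal's argument."""
        self.saw_document_complete = True
        self._maybe_fire()


calls = []
gate = CaptureGate(lambda: calls.append("capture"))
gate.load_finished()    # only one of the two events so far: nothing happens
gate.initial_layout()   # both seen: the callback runs
gate.initial_layout()   # a repeated event does not trigger it again
```

The order in which the two events arrive does not matter, which is why both slots in the original perform the same check.
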

Chris / 2012-12-21 18:59:

Thanks! This was really helpful.

For anyone else who wanted to pass in min-width and min-height...just accept them as parameters and in doCapture set the viewport size manually: self.wb.setViewportSize(QtCore.QSize(1024,768))
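
A minimal sketch of that idea, treating the two values as a lower bound on the viewport instead of a fixed size. `viewport_size` and its default values are hypothetical; only the `setViewportSize(QtCore.QSize(...))` call comes from the comment above.

```python
def viewport_size(content_width, content_height,
                  min_width=1024, min_height=768):
    """Return (width, height), never smaller than the requested minimum."""
    return (max(content_width, min_width), max(content_height, min_height))


# Inside doCapture this might be used as follows (assuming PyQt4, as in capty.py):
#     size = self.wb.mainFrame().contentsSize()
#     w, h = viewport_size(size.width(), size.height())
#     self.wb.setViewportSize(QtCore.QSize(w, h))
```

Passing `min_width` and `min_height` through `Capturer.__init__` would then be enough to make them configurable per capture.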


Contents © 2000-2023 Roberto Alsina