Capturing a webpage as an image using Pyhon and Qt
For a small project I am doing I wanted the capability to see web pages offline. So, I started thinking of a way to do that, and all solutions were annoying or impractical.
So, I googled and found CutyCapt which uses Qt and WebKit to turn web pages into images. Good enough for me!
Since I wanted to use this from a PyQt app, it makes sense to do the same thing CutyCapt does, but as a python module/script, so here's a quick implementation that works for me, even if it lacks a bunch of CutyCapt's features.
With a little extra effort, it can even save as PDF or SVG, which would let you use it almost like a real web page.
You just use it like this:
python capty.py http://www.kde.org kde.png
And here's the code [download capty.py]
# -*- coding: utf-8 -*-
"""This tries to do more or less the same thing as CutyCapt, but as a
python module.
This is a derived work from CutyCapt: http://cutycapt.sourceforge.net/
////////////////////////////////////////////////////////////////////
//
// CutyCapt - A Qt WebKit Web Page Rendering Capture Utility
//
// Copyright (C) 2003-2010 Bjoern Hoehrmann <bjoern@hoehrmann.de>
//
// This program is free software; you can redistribute it and/or
// modify it under the terms of the GNU General Public License
// as published by the Free Software Foundation; either version 2
// of the License, or (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// $Id$
//
////////////////////////////////////////////////////////////////////
"""
import sys
from PyQt4 import QtCore, QtGui, QtWebKit
class Capturer(object):
"""A class to capture webpages as images"""
def __init__(self, url, filename):
self.url = url
self.filename = filename
self.saw_initial_layout = False
self.saw_document_complete = False
def loadFinishedSlot(self):
self.saw_document_complete = True
if self.saw_initial_layout and self.saw_document_complete:
self.doCapture()
def initialLayoutSlot(self):
self.saw_initial_layout = True
if self.saw_initial_layout and self.saw_document_complete:
self.doCapture()
def capture(self):
"""Captures url as an image to the file specified"""
self.wb = QtWebKit.QWebPage()
self.wb.mainFrame().setScrollBarPolicy(
QtCore.Qt.Horizontal, QtCore.Qt.ScrollBarAlwaysOff)
self.wb.mainFrame().setScrollBarPolicy(
QtCore.Qt.Vertical, QtCore.Qt.ScrollBarAlwaysOff)
self.wb.loadFinished.connect(self.loadFinishedSlot)
self.wb.mainFrame().initialLayoutCompleted.connect(
self.initialLayoutSlot)
self.wb.mainFrame().load(QtCore.QUrl(self.url))
def doCapture(self):
self.wb.setViewportSize(self.wb.mainFrame().contentsSize())
img = QtGui.QImage(self.wb.viewportSize(), QtGui.QImage.Format_ARGB32)
painter = QtGui.QPainter(img)
self.wb.mainFrame().render(painter)
painter.end()
img.save(self.filename)
QtCore.QCoreApplication.instance().quit()
if __name__ == "__main__":
"""Run a simple capture"""
app = QtGui.QApplication(sys.argv)
c = Capturer(sys.argv[1], sys.argv[2])
c.capture()
app.exec_()
Thanks! This was really helpful.
For anyone else who wanted to pass in min-width and min-height...just accept them as parameters and in doCapture set the viewport size manually: self.wb.setViewportSize(QtCore.QSize(1024,768))