Posts tagged with 'Cyotek WebCopy'

WebCopy 1.8 - JavaScript Support

Cyotek WebCopy 3 Comments

One of the long-standing requests/complaints is for WebCopy to support JavaScript-enabled websites, e.g. modern SPAs where JavaScript is used to build the page. Traditionally this is something I have always put onto the furthest of back burners, as in order to support this natively I'd have to essentially write half a browser, something that would be a full-time job and a half and not something I'm interested in doing. Other solutions did exist but I never really looked into them.

It recently occurred to me, however, that I'd put into place all the building blocks I needed to have WebCopy support JavaScript execution (in a limited fashion, more on this later) using Internet Explorer. And it was easy; in fact, the hardest part was sorting out threading issues. Although WebCopy currently only crawls on a single thread, that thread is separate from the UI thread so that the UI doesn't freeze, and COM can have a problem with this.

Read More

WebCopy 1.8 - New Project Wizard

Cyotek WebCopy 0 Comments

In my previous post regarding WebCopy 1.8, I briefly covered a general grab-bag of some of the new features in this version. This post is dedicated to another new feature, the New Project Wizard.

Whilst you can still create a new blank project as with previous versions of WebCopy, there's also a new GUI that will ask a series of questions and create a neatly configured project.

Read More

Introducing WebCopy 1.8

Cyotek WebCopy 0 Comments

It's been over two months since the last CI build of WebCopy was made, and during this time I've been working quite hard on some major internal refactoring and adding a long-requested feature. I hope it's worth the wait, I need a break!

WebCopy 1.8 nightly builds are now available for download and so this series of posts will describe some of the changes and new functionality that have been made to the software. This first post will cover a grab bag of smaller changes.

Read More

WebCopy 1.7 - local file name generation

Cyotek WebCopy 0 Comments

As part of WebCopy 1.7's mission to reduce user confusion and make the product more appealing, a pair of new options for controlling local file name generation have been introduced, as well as correcting a potentially confusing bug.

By default, WebCopy will name local files to match their content type. For example, if you download the homepage of a website which is named index.php, WebCopy will save a local file named index.html - end users would probably be very confused trying to open a .php file, as either the operating system doesn't know how to handle it or it executes the PHP runtime.
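
As a rough illustration of the idea (not WebCopy's actual code), the following Python sketch picks a local file name from a URL and the Content-Type header of the response. The function name local_file_name and the small extension table are hypothetical, and a real crawler would need a much larger mapping.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical mapping for illustration only; not WebCopy's real table.
CONTENT_TYPE_EXTENSIONS = {
    "text/html": ".html",
    "text/css": ".css",
    "image/png": ".png",
    "image/jpeg": ".jpg",
}


def local_file_name(url, content_type):
    """Return a local file name whose extension matches the content type."""
    path = urlsplit(url).path
    # A URL ending in "/" has no file name, so fall back to "index".
    name = posixpath.basename(path) or "index"
    stem, ext = posixpath.splitext(name)
    # Drop parameters such as "; charset=utf-8" before the lookup.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Unknown content types keep whatever extension the URL already had.
    return stem + CONTENT_TYPE_EXTENSIONS.get(media_type, ext)


print(local_file_name("https://example.com/index.php", "text/html; charset=utf-8"))
```

With the table above, a page served as text/html from index.php is saved as index.html, while a response with an unrecognised content type keeps the extension from its URL.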

Read More

WebCopy 1.7 - tls/ssl invalid certificate handling

Cyotek WebCopy 1 Comments

You should add an option to ignore checking for an SSL certificate.

The above quote is the last part of a piece of uninstallation feedback I received about WebCopy on Friday. This isn't the first time I've had anonymous feedback about ignoring SSL errors and each time it has happened it has been frustrating and even bewildering as the option is already there and has been since 2013!

Read More

WebCopy 1.7 - web browser authentication

Cyotek WebCopy Cyotek Sitemap Creator 0 Comments

There are five main features WebCopy (and Sitemap Creator) need based on user feedback and our own observations. In no particular order, these are making the product easier to use, supporting multiple downloads at once, being able to pause and resume a copy, JavaScript support and authentication. The current plan is to address three of the five in WebCopy 1.7, starting with authentication.

Since the earliest days of WebCopy, it has supported challenge authentication (where a web browser prompts you for credentials) and form-based authentication (where you enter credentials into a web page). Almost all web sites use the latter approach and with WebCopy this can either be tricky to configure or impossible due to websites using interactive methods such as authenticators or CAPTCHA codes.

Read More

WebCopy 1.4 beta released

Cyotek WebCopy

A beta version of WebCopy 1.4 complete with a fundamental change to how rules are run, various performance improvements, UI tweaks and miscellaneous bug fixes has been released.

In previous versions of WebCopy, rule processing would stop as soon as the first rule was matched. This made it impossible to do standard tasks like exclude all HTML pages from being downloaded (but still scan them) and only download image resources, as an example.
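
The difference between the two approaches can be sketched in Python as follows. This is only an illustration under stated assumptions: the Rule class, its download flag and the "last matching rule wins" way of combining matches are all hypothetical and are not taken from WebCopy itself.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: str    # regular expression tested against the URL
    download: bool  # hypothetical flag: download matching URLs or not


def should_download(rules, url, stop_at_first_match):
    """Decide whether to download a URL (assumption: last match wins)."""
    download = True  # assumed default when no rule matches
    for rule in rules:
        if re.search(rule.pattern, url):
            download = rule.download
            if stop_at_first_match:
                break  # old behaviour: later rules are never consulted
    return download


rules = [
    Rule(r".*", False),                  # exclude everything from download...
    Rule(r"\.(png|jpe?g|gif)$", True),   # ...except image resources
]

url = "https://example.com/images/logo.png"
print(should_download(rules, url, stop_at_first_match=True))
print(should_download(rules, url, stop_at_first_match=False))
```

When processing stops at the first match, the catch-all rule wins and the image is skipped; when every rule is evaluated, the later image rule gets a chance to override it.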

Read More

Transforming hyperlinks when copying websites

Cyotek WebCopy 0 Comments

Recently a website I infrequently use was badly defaced, and in the course of repairing the damage the owners of the site temporarily took it down. As I found it to be a very useful resource I lamented not having an offline copy and so when the site was restored, I decided to make a copy without further ado.

However, as I swiftly discovered, that was a problem - the site used JavaScript for many internal links, and WebCopy doesn't support JavaScript. Somewhat fortunately, when I looked at how the JavaScript links functioned, I discovered they were all of a predictable nature - a call to a single function with two string arguments. The destination URL was a simple concatenation of these arguments with no extra processing.
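
A transformation of this kind can be sketched with a regular expression. The Python below is a minimal illustration, not the approach WebCopy uses: the excerpt doesn't name the site's function, so openPage is a hypothetical stand-in, as is the transform_link helper.

```python
import re

# Matches e.g. javascript:openPage('/articles/', 'intro.html')
# "openPage" is a hypothetical name; the real site's function is not given.
LINK_PATTERN = re.compile(
    r"""javascript:\s*openPage\(\s*(['"])(.*?)\1\s*,\s*(['"])(.*?)\3\s*\)\s*;?"""
)


def transform_link(href):
    """Turn a two-argument JavaScript link into a plain URL."""
    match = LINK_PATTERN.fullmatch(href.strip())
    if match is None:
        return href  # not a JavaScript link of the expected shape
    # The destination is simply the two string arguments joined together.
    return match.group(2) + match.group(4)


print(transform_link("javascript:openPage('/articles/', 'intro.html')"))
```

Links that don't match the expected shape are returned unchanged, so ordinary hyperlinks pass through untouched.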

Read More

srcset attribute support, custom attributes, 300 status support and more

Cyotek WebCopy 0 Comments

A new beta version of WebCopy has been released, containing a range of features and bug fixes.

If you're finding WebCopy useful, please donate to keep the project alive

Read More

On WebCopy, Continuous Integration, .NET Framework 4.5 and end of Windows XP support

Cyotek WebCopy Cyotek Sitemap Creator 0 Comments

This is quite a long post so I'm just going to add an important bit of news here - WebCopy 1.1 will not be able to be installed or run on Windows XP.

WebCopy, like most Cyotek products, is built in C# using Microsoft .NET Framework 3.5, thus allowing it to run on Windows XP onwards. Each time the product is built, a batch file is manually run which compiles the solution, signs the files, does some "deployment ready" checks, generates the documentation and then generates the setup. Tests are not run as part of this process as generally they are always running in the IDE via NCrunch.

Read More