Murdoch University Research Repository

The dangers of webcrawled datasets

Bell, G.B. (2010) The dangers of webcrawled datasets. First Monday, 15 (2).

This article highlights legal, ethical and scientific problems arising from the use of large experimental datasets gathered from the Internet, in particular image datasets. Such datasets are currently used in research on topics such as information forensics and image processing. This paper strongly recommends against Webcrawling as a means of generating experimental datasets, and proposes safer alternatives.

Item Type: Journal Article
Murdoch Affiliation: School of Information Technology
Publisher: University of Illinois
Copyright: Creative Commons Attribution 2.5 UK: Scotland License
Notes: “The dangers of Webcrawled datasets” by Graeme Bell is licensed under a Creative Commons Attribution 2.5 UK: Scotland License. Permissions beyond the scope of this license may be available at