The search engine uses the electronic versions of the 35 volumes (containing entries from A to ROWNY) of the dictionary of 16th-century Polish, made successively available since 2008 in DjVu format in the Kujawsko-Pomorska Digital Library under the terms of the Creative Commons Attribution-Noncommercial-No Derivative Works 2.5 Poland license.
The first 32 volumes were scanned, and later OCRed with LizardTech DocumentExpress Professional 5.0, by the library staff.
Volumes XXXIII and XXXIV are digitally born; they were converted to DjVu format by Jakub Wilk using the pdf2djvu and minidjvu programs.
On July 14, 2010, the digitally born versions of volumes XXXI and XXXII replaced the scans; they were likewise converted to DjVu format by Jakub Wilk using the pdf2djvu and minidjvu programs.
Unfortunately, using the digitally born versions of some other volumes is not possible for formal reasons described in Janusz S. Bień's paper.
Volume XXXV is the first one prepared with the intention of also distributing it online, as required by the sponsor, the Foundation for Polish Science. It was converted to DjVu format with the pdf2djvu program by the staff of the KPBC library. Unfortunately, in this case such an online form is not optimal; see, e.g., the relevant fragments of Krzysztof Szafran's book (in Polish) and its review.
The OCR results for the scanned volumes and the hidden text layer of volumes XXXI-XXXIV were converted to a suitable corpus format by Jakub Wilk. Volume XXXV was converted to a suitable format and added to the corpus by Krzysztof Szafran.
The corpus consists of ca. 33 million segments. This is version 6 of the corpus, available since March 20, 2012.
Search can be limited to a specific volume with the meta clause, e.g. meta vol=i/X or meta vol=I/X limits the search to the first volume, while meta orig=pdf limits the search to the 5 digitally-born volumes.
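As a rough illustration, the meta clauses mentioned here can be composed programmatically. The sketch below is a hypothetical helper, not part of the search engine; it assumes that a meta clause is simply appended to the query text after a space, and the query dom is an arbitrary example that does not come from this page.

```python
def with_meta(query: str, key: str, value: str) -> str:
    """Append a meta clause such as 'meta vol=i/X' to a query string."""
    # Assumption: the meta clause follows the query, separated by a space.
    return f"{query} meta {key}={value}"


# 'dom' is an arbitrary example query chosen for illustration only.
first_volume = with_meta("dom", "vol", "i/X")   # restrict to the first volume
born_digital = with_meta("dom", "orig", "pdf")  # restrict to digitally born volumes

print(first_volume)
print(born_digital)
```

The helper only concatenates strings; whether a given combination is accepted is decided by the search engine itself.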
The within clause can be used to limit the search to the so-called sections: front (the frontmatter), intro (the prefaces in the first volume), list (the list of entries provided at the beginning of every volume), body (the entries), errata (corrections and additions; this section is contained in body), back (backmatter), inset (loose insets with the lists of sources and abbreviations).
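The section names listed above can be kept in a small lookup table so that a within clause is only built for a known section. This is a minimal sketch under stated assumptions: SECTIONS and within_section are hypothetical names, the clause is assumed to follow the query after a space, and dom is an arbitrary example query.

```python
# Section names and descriptions as listed on this page.
SECTIONS = {
    "front": "the frontmatter",
    "intro": "the prefaces in the first volume",
    "list": "the list of entries provided at the beginning of every volume",
    "body": "the entries",
    "errata": "corrections and additions (contained in body)",
    "back": "backmatter",
    "inset": "loose insets with the lists of sources and abbreviations",
}


def within_section(query: str, section: str) -> str:
    """Return the query restricted to one section, rejecting unknown names."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section!r}")
    # Assumption: the within clause follows the query, separated by a space.
    return f"{query} within {section}"


print(within_section("dom", "errata"))
```

Since errata is contained in body, a search restricted to body also covers the corrections and additions.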