When, almost a quarter of a century ago, I got involved in my first real corpus-creation task, the state of technological development, not only in our country but worldwide, was such that everything concerning the collection, annotation and tagging of electronic databases, as well as the development of software tools for processing them, sounded like something out of a science fiction novel or film. To have millions of words in electronic form and be able to process them seemed, as Sinclair (1991a: 1) put it, almost lunatic. Nowadays, with large electronic libraries of books available, with easy access to electronic editions of newspapers, magazines, journals, blogs and social media, and with almost every piece of writing existing in electronic form before it is ever printed, it is hard to believe that back then we had to do the whole work of corpus creation and compilation from scratch rather than simply copy and paste ready-made textual material.
It should be noted, however, that what made the first steps of the 1990s faltering and difficult was not so much the limited capacity of computers to store and process large amounts of language, nor the scarcity of texts in electronic form. These inconveniences were compensated for by the necessary enthusiasm and devotion, and by many more hours of hard manual labour on design criteria and on the collection and digitization of the data than anyone would invest in this kind of activity nowadays. All the obstacles were overcome by our belief that the effort was worth it, because we realized that the future of linguistic research and practice, which reflects spoken and written communication, could not remain isolated from the processes of globalization and technological development.