Thursday, 19 April 2018

Talk to the Scholar (Book)

I have worked on several fascinating projects this term (besides teaching and administrative duties), each of which may deserve a post of its own. We worked, with more downs than ups, on re-establishing Digital Humanities MA programmes in Hungary. At the moment, though, I do not have a clue about the outcome of these efforts: the documents are at the ministry, waiting for a decision. I am working on the Hungarian Shakespeare Archive, which fills me with joy, though sometimes I am not sure whether the time and energy I invest in this project are useful to anybody. I worked on the boards of two Digital Humanities journals, a Hungarian one (Digitális Bölcsészet) and a more international one (Digital Scholar), and reviewed articles for both. Also, I had the opportunity to take part in, and teach four classes at, the Text Analysis Across Disciplines Boot Camp at CEU. For the sake of advertising Digital Humanities in Hungary, I wrote up a longish Wikipedia entry about Digital Humanities. Furthermore, I am working on an online course focusing on digital cultural memory, to be finished by September at the latest.

All these are projects that I just enjoy immensely, but they also line up into a more ambitious project, i.e. improving academic life. This terribly, horribly, frighteningly ambitious project consists at the moment of two distinct subprojects. One of them is automating everything that can be automated in a literary scholar's job, while the other is understanding, and thus making more meaningful, education at an English department. To put these more bluntly, I am lazy enough to let the machine do what it is better at than I am, and I want to be able to tell my students why it is beneficial for them to attend my (or, for that matter, anybody's) classes.

When daydreaming about this ambitious project, I keep looking at what other people and teams around the world are doing in this area. Keeping an eye on these is pretty easy with Twitter and RSS feeds. The next book on my reading list, for example, is Cathy N. Davidson's most inspiring new book The New Education: How to Revolutionize the University to Prepare Students for a World in Flux (New York: Basic Books, 2017), which I came across via my Feedly. The other finding is the Talk to Books project, announced 5 days ago on the Google research blog (THX, Feedly again). And it is this project that I would like to write about now, as it nicely fits the scholarly aspect of the dream project.

Talk to Books may well give a hand in research if it keeps improving steadily, and there seems to be every chance of this. Talk to Books is a project within Google Books and it promises a semantic search engine. What Talk to Books does is rather fancy: you ask a question, the machine makes sense of the question, searches 100,000 books (at the moment) and tries to answer the question by leading the researcher to books wherein the answer lies, highlighting the sentence in each book which seems to answer the question. This seems to be similar to WolframAlpha insomuch as semantic search is concerned, and similar to Understanding Shakespeare, the collaboration between the Folger Shakespeare Library and JSTOR, in that both help scholars gather secondary sources. What differentiates the Talk to Books project from WolframAlpha is that the latter provides information, while the former provides information that is documented. And Talk to Books is more sophisticated than the Folger and JSTOR collaboration to the extent that there is an element of a communicative situation in it. Of course, the communicative situation is in a way a fake one: the machine does not understand the question as a human being would, and the answers are sometimes completely off the track, but the method of faking communication works pretty well.

The model underlying Talk to Books relies on word vectors, a statistically trained model that relates meaning to strings of letters by analysing the contexts in which words occur. The model, in this case, is trained on contexts taken from natural language use, and the training includes a highly complicated process of testing, curating verbal contexts, filtering out mistaken contexts (noise) and reducing the examples to relevant verbal contexts. The code under the hood of Talk to Books is built on Google's machine learning toolkit, TensorFlow. More about this can be found in the TensorFlow tutorials.
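To give a feel for what "relating meaning to vectors" can look like, here is a minimal sketch in Python. It is not the Talk to Books model: the three-dimensional vectors below are invented by hand for illustration, whereas real word vectors are learned from large corpora and have hundreds of dimensions. The only idea it shows is that words used in similar contexts end up with vectors pointing in similar directions, which can be measured with cosine similarity.

```python
import math

# Hypothetical toy vectors, made up for this illustration only.
toy_vectors = {
    "theatre": [0.9, 0.8, 0.1],
    "stage": [0.8, 0.9, 0.2],
    "inkwell": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Return the other word whose vector is closest to that of `word`."""
    candidates = (other for other in vectors if other != word)
    return max(
        candidates,
        key=lambda other: cosine_similarity(vectors[word], vectors[other]),
    )


print(most_similar("theatre", toy_vectors))
```

With these invented numbers "theatre" comes out closer to "stage" than to "inkwell"; a semantic search engine applies the same kind of comparison to questions and sentences rather than to single words.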

Now, tasting is the ultimate test of this pudding, so let us see how Talk to Books works. I am writing a paper about spectatorship, so it might be interesting to check whether Talk to Books comes in handy here. After some trials and refinements (the user should also adapt to the abilities of the machine) I came up with this question: "what is spectatorship in a theatre?" After I hit the search icon, approximately 15 books showed up on the screen, with quotations from them and links to them in Google Books. Out of these books, seven were closely related to theatre studies, and I found rather relevant quotations in them. Three of the books centred on spectatorship in the cinema, which is understandable as spectatorship studies are closely linked to film, and even these books referred directly or indirectly to the theatre, so these would not be irrelevant either. The rest of the books referred to spectatorship in diverse contexts, such as social research, folklore studies, discourse analysis and rhetoric.

The results of this simple search are telling on three accounts. First, the results seem to be rather relevant, so the word vector technology lying at the backend of deep learning technologies in general, and TensorFlow in particular, seems to be promising. Second, even the irrelevant hits may well prove beneficial, because they help one think outside the box and bump into scholarly findings that are related to one's research semantically but not discipline-wise. Thus, if I intend to be really generous, I should admit that the search engine facilitates interdisciplinary studies as well. Third, Talk to Books may well ease the scholar's tasks: it is easy to copy and paste the relevant quotation, one can check the context of the quotation via the link to the entire book, or rather to a page in Google Books, and one can get hold of the bibliographical data of the book.

Although Talk to Books is promising, I can see room for development in three respects. First, Talk to Books as it is now does not have a specific target audience. Judging by these first impressions, it seems to me that the target audience is the educated, English-speaking community of intellectuals. This wide user set is understandable from the perspective of the developers, since they need statistically relevant results to test the application. From the scholarly user's point of view, though, the target audience should be the scholarly community, and thus the linguistic behaviour of the scholarly community should be more relevant for the textual corpus used for training the model in TensorFlow. Second, again from the scholarly community's perspective, harvesting metadata is still more laborious at the moment than it could be. If one intends to use, say, Zotero, one has to click many times, i.e. one has to go to the page in Google Books, find the "information about this book" link, search for the ISBN and paste it into Zotero. Instead of these numerous clicks, one click would be better... Third, some filtering methodology would help the scholarly user, and a wider corpus including journals would come in handy, provided the application is to serve scholars. OK, I understand that Talk to BOOKS is about books, but scholars use journals as often as edited volumes or monographs.

In conclusion, I am just overwhelmed by the Talk to Books project. I am overwhelmed, because I can see my dream project, i.e. automating whatever can be automated in scholarly work, come true with this project, or at least one significant aspect of this. I am overwhelmed because I find in this project a promising use of deep learning technologies in ways that are already beneficial. And whatever misgivings I have concerning Google, there are amazing people in their ranks who can and do shape our digital futures.

Wednesday, 25 October 2017

Conference, OJS

Yesterday (24 Oct.) I attended a brilliant one-day conference about Modern Platforms for the Publication of Journals. Before all the important ideas I came across there sink into oblivion, let me jot them down. I am doing this in the hope that I may use these ideas later on, and that you will find them beneficial, too. As memories soon fade into darkness, and I had no pen or pencil with me yesterday, I used Twitter for sharing and keeping the best ideas, and I will use my tweets as helping hands in the act of recollection.

First and foremost, the conference focused on the journals managed by the publishing house of the Hungarian Academy of Sciences, which publishing house is owned by Kluwer. Now, Kluwer is a profit-oriented Dutch enterprise, so let me share with you my dissatisfaction with the fact that the Hungarian Academy of Sciences as a publisher does not have financial and scientific freedom but is dependent on a private organization. Anyway, this is what life is, so let us move on towards more productive ideas. Or no, let me stay on a bitter note: the presentation was about what value a publisher adds to the process of publishing, and I am quite convinced that the workflow presented is necessary. On the other hand, however, during the Q&A session the answer to the question about payment for the authors and peer reviewers was just shocking: 1-3 months of access to the journal free of charge. Please!


The conference implicitly, and the speakers explicitly, kept reflecting on Open Journal Systems (OJS from now on), which is an open access and open source platform for the management of submissions to and publication of a journal. The Wikipedia article on OJS is rather informative. All the speakers agreed on its usefulness from different angles. Some speakers shared their impressions: both authors and reviewers found that OJS added value and significance to the journal, and perhaps made the journal itself seem more professional. Others mentioned that the management of submissions is awesome: it is very difficult to make mistakes, submissions do not disappear, and every step of the editorial process can be tracked. Another speaker (Andrea Horváth) claimed that since they had started using OJS the number of submissions had doubled or tripled. The downside of OJS is the learning curve, which is steep, and occasionally the editors needed some help when facing problems that seemed irresolvable at the time.

Another advantage of OJS is its flexibility and compatibility with other applications and services. It is rather beneficial that OJS can work seamlessly with DOIs (Erika Bilicsi). A Digital Object Identifier (aka DOI) gives each paper of the journal something like an ID card in cyberspace: it provides an “everlasting” identity to the digital object. Metadata, such as a URL, are linked to the object, and while the metadata may change (and should then be updated), the DOI does not. For more details visit the DOI webpage and the relevant Wikipedia article.

Another speaker (László Peregovits) argued that an ORCID is very important as well, and that it also works nicely with OJS. If the DOI is used to identify a digital object, such as a journal article, the ORCID is the identifier of the researcher or author. The ORCID can be used when submitting an article, and it is useful for linking academic activity to a researcher. This helps visibility for the author and for the products of research, such as research articles. If you are interested in more details about ORCID, check out the ORCID Wikipedia entry. Furthermore, the same speaker claimed that they only allow authors with an ORCID to submit manuscripts. Creating an account at orcid.org, and thus obtaining an ORCID, is not a big issue: it took me 2 minutes, though surely adding publications etc. is the more laborious part. This seems to be a worthy project, so I’ll sooner or later provide my data there as well.

During the conference I also found that the people at HAS Library are really nice and helpful. This is rather reassuring on two accounts. First, because I am a member of the editorial board of a new journal, which is run on OJS. I can tell that the learning curve is steep and one needs help occasionally. Secondly, because if I happen to try to convince PPCU to use this fantastic platform for our digital journals, then it is good to know that there are helpful people out there who can and are willing to help.

All in all, the conference was absolutely inspiring. I learned a lot including the significance of DOI, ORCID, OJS and some trends in terms of journals, platforms and publication. I hope I may make use of this knowledge in the near future especially in ways that foster Open Access publications, after all this is Open Access Week.

Monday, 23 October 2017

János Arany and Europeana

This post is dedicated to the János Arany jubilee year in Hungary and to Europeana. The two topics will be linked insomuch as I will explore what an interested reader may find in Europeana about / from János Arany and what (s)he can do with the findings.

https://upload.wikimedia.org/wikipedia/commons/0/0a/Barabas-arany.jpg
Miklós Barabás: János Arany
Anon. photograph
The jubilee year (200th anniversary of his birth) celebrates János Arany as one of the greatest Hungarian poets, scholars and translators of the 19th century. Arany’s poetry is taught in schools, recited in theatres, and published again and again in popular and scholarly publications. He is also celebrated as a fantastic translator, who rendered a great variety of authors, from Greek drama to Shakespeare, into Hungarian. As far as translations are concerned, he was not only an awe-inspiring master of the Hungarian language but also did much for the translation of the entire Shakespearean dramatic oeuvre into Hungarian as the president of the Hungarian Shakespeare Committee. The Committee was brought into being as a working committee of the Kisfaludy Society and was responsible for the publication of the first complete Shakespeare (1864-1878). Arany himself translated A Midsummer Night’s Dream, King John and Hamlet for the project, and these became the iconic, sacred translations of these plays and remained so until the end of the 20th century. A more detailed article on Arany can be found in Wikipedia.

My screen image -- ZSA.
Now, what can be found about Arany in Europeana? When searching for the name “János Arany” the user is given 856 hits, most of which are relevant, while a few are to be eliminated as they are about contemporaries and other topics, people and objects. Then I used the awesome filtering option of the portal (gateway) to narrow down the search to images, and the machine found 435 image files. I checked them first at 12 images per page. This way I could find interesting portraits of him from a variety of ages and made for a variety of purposes. Some of them represent him seated in traditional Hungarian garments or just in a suit, with his unmistakable large moustache; some of them show him as a middle-aged gentleman, some as an old man. Most of the images (unfortunately, mostly the Hungarian ones) cannot be included in this post, as their copyright statements claim that permission is to be requested from the institution owning the object.


Arany's statue in front of the National Museum
Maybe I am too lazy, or maybe I represent the majority of users, but I did not have the time and energy to request permission, so for the sake of remaining on the safe side I will not include them in this post. The only image from the Hungarian collections that I have found usable can be seen here: “Statue of János Arany in front of the Hungarian National Museum”. The significance of this photo is that it fosters Arany’s iconic status, as the statue is located in front of one of the most important Hungarian sites of memory.

Museo Postal y Filatélico de Barcelona
Then I checked images located originally in European countries outside Hungary.
Arany's armchair
Having found digital objects in the UK (Bodleian Library) and in Germany and Spain (this latter is on the left), I would like to mention only the ones from Romania. These images are of objects in Arany’s memorial museum located in Szalonta, Romania. These images can be freely used (OK, NC-ND licence) so you can find some of them here as well. I am not quite sure that Arany actually used these items, whether he sat in this very chair, and if yes, when in his life, whether he used the inkwell below.



Arany's inkwell
Though I am not quite sure about the intimate relationship between Arany and the objects, I am rather positive that these and the rest of the items in Europeana are worth calling attention to. And also that Arany himself is worth remembering as an outstanding poet, a scholar and translator not only in Hungary but in Europe as well. He was a man of great talents and used these talents to culturally integrate and link Hungary to Europe, Europe to Hungary. Both the jubilee year and Europeana remind us of the great cultural heroes in Europe, and present models to follow. Happy Birthday, János Arany!

Friday, 2 December 2016

Obituary József Gedeon

József Gedeon (Igor Grín 2008)
József Gedeon, manager of the Castle Theatre in Gyula, Hungary, organizer of the Shakespeare Festival (Gyula), member and founder of many Hungarian and international art associations, died Nov. 25. He was 60 years old.

József Gedeon was born in Gyula and spent most of his life in his hometown, with the exception of his student years in Szeged, where he obtained his degree in Comparative Literature, and in Budapest, where he studied Art and Design Management at Moholy-Nagy University of Art and Design. In his hometown he worked as an extra and then as an actor in theatrical productions, was a teacher of Literature and English, was the head of the Cultural Department of the Local Government, and became the director of the Castle Theatre in 1995. As the director of the Theatre he organized the annual Summer Festival, part of which was the Shakespeare Theatre Festival. He also brought into being an International Jazz Festival. He was a founding member of the Hungarian Erkel Ferenc Society, the Association of Outdoors Theatres and the Hungarian-British Friendship Society at Gyula, a member of the Steering Committee of the Hungarian Shakespeare Committee, and the initiator and founder of the European Shakespeare Festivals Network.

József Gedeon was a man of vision, charisma and uncompromising power. His achievements as a local cultural patriot, as a national and international figure of cultural life need no further commentary. I have known him since the refoundation of the Hungarian Shakespeare Committee, when he was invited to act as an active member of the board of the committee. Since then I have been in touch with him in a variety of capacities, mostly owing to events related to Shakespeare’s reception. He was full of energy and enthusiasm for whatever he was involved in: he was keen on giving a lecture on the history of the Hungarian Shakespeare Festival (he travelled 300 kms to Budapest and another 300 back home for this speech). Besides being passionate and knowledgeable about Shakespeare, the theatre, Gyula, he was always happy to listen to other people’s opinions, he was open to discussions concerning the conference during the Shakespeare Festival, but he was also able to disagree when he found reason for doing so.

His death not only fills everybody who knew him with sorrow but also creates a vacuum in Hungarian cultural and theatrical life, and in the Hungarian reception of Shakespeare, a vacuum that can hardly be filled. A man of heart and steel has been lost, a man who was one of us and also above us, a man who made history, cultural history, a man we on every side of Shakespeare reception sorely miss.

Zsolt Almási
Secretary of the Hungarian Shakespeare Committee

Monday, 25 July 2016

Academic blogging: why?

Now that the summer is in full swing, and being away from everyday bureaucratic work gives me the freedom to ponder stuff that is normally suppressed by daily duties, I have started thinking about why I love blogging. Although I can easily list a hundred reasons why I should not write these blog posts, I just love musing about the ideas that concern me most at a given moment.

To begin with, let us see why it may seem counterproductive to spend time writing blog posts. First, blogging has no academic value: it does not count in one's list of publications, so it is a waste of time. Second, a blog post is not peer reviewed, so its contents may well be questionable. Third, and this is at least a feature of my blog, rather few people read it: it is not academic enough for my colleagues, and maybe too academic, at least topic-wise, for others. And fourth, I do not publish blog posts regularly enough to attract readers. Surely this last one is a person-specific problem: as I run this English blog and I also have a Hungarian one, the posts appear either here or there, so new posts appear rather rarely on each.

Though these counterarguments seem sound, I would still like to reflect on them. First, of course these blog posts do not surface in the list of publications, and yet they are not completely valueless. The list of publications does not have merit on its own, and I hope and believe that what is meant by scholarly value may change over time. But undeniably, at the moment scholarly value remains a problem. Second, clearly the blog posts do not go through the process of peer review. Although peer review has its own discontents, I do not intend to rehearse them here, first and foremost because I deem peer review a necessary and beneficial institution. But some sort of peer review is at work in the case of blog posts too, even if not in the prepublication phase. Comments function as postpublication peer review, which is as important and relevant as the prepublication one. Third, the problem of too few readers. I reckon not many more people read my other writings that are hidden behind the paywall, and a comment by Jonathan Hope means much more to me than many references by other scholars. Fourth, the two-language blogging. Writing blog posts is really fun, and if it is fun in two languages, then let it be like that; I can live with maybe losing readers because of the small number of posts per blog. Maybe in the future I will unite the two blogs, so that both English and Hungarian posts will appear next to each other.

Refuting counterarguments provides insufficient reasons for blogging, though. So why do I find so much fun in writing blog posts? One of the reasons is my fascination with Open Access. I reckon blogging is just a contribution to the growth of Open Access content and ultimately to the cause of the Open Access movement, which is in a sense an end in itself for me. It also matters that I enjoy the process of writing up shorter pieces. With longer writings, joy is deferred so much that I sometimes get tired of that type of work. Writing up a blog post, though, gives much more immediate satisfaction, since I can finish a short piece of max. 1000 words in an afternoon. And also there is no suffocating feeling of a must-do activity. Journal articles must be written, a book is under way, these are necessary parts of academic life, and I enjoy these too. But writing blog posts is really for joy: if I have an idea to verbalize in a post, and I have the free time to work on it, then I enjoy myself this way without the pressure of compulsory work. It is also significant that I just love putting ideas into words: as an academic and old-fashioned humanist I believe in the power of words, which shape reality even if in the most remote sense.

Why I like blogging so very much is also due to the change in register. I appreciate the tense, academic style that addresses the initiate. But I also find pleasure in turning to a somewhat more colloquial style that reaches beyond the small circle of academics. This is one of the reasons why I contribute to Wikipedia with entries from my field. This doesn't mean, though, that I have a clear notion of who reads my blogs. Most of the (small number of) comments on this English blog are from friends and academics, but I have no idea who reads it without commenting. The Hungarian one, however, is clearly read by non-academics as well: I have received comments from people I know to be outside academic circles, and I am also aware of people from all walks of life reading the blog.

And a last reason lies in the fact that very few Hungarian scholars have a blog. Although there is a growing number of academics who write blog posts, still this medium is not so fashionable as it is in the Anglo-American world. There might be some cultural reasons for this difference, e.g. shyness, not so much inclination for writing etc. The cultural differences, however, do not hinder me from this activity but rather encourage me.

These are some of the reasons why I keep using the medium of blogs. They may sound weak for some people, from certain perspectives, yet they are sufficient for me at the moment.

Friday, 3 June 2016

Open Access and the new culture of information flow

I consider myself an advocate of the Open Access movement not only in words but in deeds as well. That said, I have to admit, or rather, precisely because I am an advocate I must admit, that I have some problems with Open Access publications. Namely, I am not quite sure that Open Access publications can reach their target audience as effectively as their counterparts behind the paywall.

Creative Commons
GNU
I consider myself an advocate of the OA movement, because whenever I have the opportunity, I speak about it. Surely, I can speak about the concept of OA in the greatest depth during my Digital Humanities classes. There I have the opportunity to elaborate on the difference between free and OA, on the various shades of OA (gold, green), on the numerous licensing opportunities from GNU GPL to Creative Commons and the degrees within these, and on the Budapest Open Access Initiative. I frequently use the OA button in my browser. Furthermore, I am also happy to speak about Aaron Swartz and Alexandra Elbakyan, about The Internet's Own Boy, and Sci-Hub. When reading out parts of the "Guerilla Open Access Manifesto," my voice betrays my emotional involvement, similarly to the moments when reciting Bertrand Russell's "Preface" to his Autobiography, or when reading out Lear’s words as he carries Cordelia’s dead body on stage.

Aaron Swartz
Alexandra Elbakyan
Being an advocate of OA does not only involve talking about this fantastic concept, practice and responsibility of the Internet; I also try to act accordingly. Running a blog is one step towards academic openness. Together with an English colleague I founded an OA journal, e-Colloquia, which is not alive at the moment but should / could be resurrected soon. I regularly contribute to Wikipedia, and ask my students to do so within the framework of editathons, too. I share the PowerPoint and Prezi presentations for my classes on Slideshare and make them open on Prezi so that others can make use of them. I share my course descriptions so that anyone can copy and develop them. I am also happy to share my projects (scripts and texts) on GitHub so that anyone interested can copy, download or fork them. So I try to practise what I preach.

That said, I also have to share my problems with accessing OA objects in general and OA books in particular. The case is easy once I have learnt about an OA object or book: I only enter the relevant strings in the search window of the browser and Bob's my uncle. The problem arises when I just do not know about, or simply forget about, say, an OA book. If I do not know anything about an OA book, then I will not be able to find it. Where is the problem here, one might ask. Why would you look for something when you do not know whether it exists at all? Yes, this is true, and it is the very problem at the same time. I learn about books that are expensive, written by authorities in the field and counted as landmarks in the discipline well before their publication, as news of them reaches me very fast. Appetisers, i.e. academic advertisements, call my attention to them, and by the time of publication I am eager to purchase and read them.

But this is not the case with OA books. Their authors do not mention their OA publications either during the pre-publication phase or after it, clearly because of shyness, or for fear of self-promotion, or because they believe that a good book or article does not need advertisement, you name the reasons. The publishing house does not have an interest in advertising the OA publication, as advertisements need investment without return. Most of the time the funding for the OA publication does not include the cost of advertisements. Thus, beyond the fact that advertising OA publications is not in the best interest of publishers, funding authorities never think about this either: their sole objective is to have the results of a research project published.


Should the cost of advertisements then be built into the research costs? Maybe. Or should a new academic culture of "care and share" be created? The digital arena not only fosters OA publishing but also provides ample opportunities to let colleagues know about one’s publications: they may be notified via personal emails and email lists, and the books can be advertised through social media: Twitter, Facebook, Google+, Academia.edu. But do we have the time and energy and self-confidence for this self-promotion? This initial step should be made, I’m afraid. But then it is the scholarly community’s responsibility to pass on the news of an OA publication, so that it spreads in concentric circles. Furthermore, it is also the task of the big names in the guild to promote these publications, as their voice is stronger: it reaches out to more people and is heard more easily. Does this mean that new channels of promotion, a new advertising culture based on the principle of "care and share," are to be built? A culture that is not founded on profit but on responsibility for colleagues and for the welfare of the discipline? Maybe.

Images:
Creative Commons: creativecommons.org
GNU GPL: gnu.org
Aaron Swartz By Fred Benenson - User: Mecredis - http://www.flickr.com/photos/creativecommons/3111021669/, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=6587124
Alexandra Elbakyan: By Apneet Jolly - https://www.flickr.com/photos/ajolly/4696604402/, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=47280109

Tuesday, 27 October 2015

Quantitative Analysis with Question Marks

The fall break started two days ago, and I have had just the leisure to get back to writing a short Python script. I have been working on this project for a while, but as a newbie I just take steps forward pretty slowly. The script I am working on is supposed to analyse any text but actually every modification I introduce into it is the result of the problems I face when I run the script to analyse quantitatively the quarto edition of Shakespeare’s Much Ado About Nothing. I am wondering if you have to tune the script for every text. But then this would mean that comparing different texts would be impossible. This, however, would lead too far, so instead of this let me mull over a specific problem.

In this post I am going to share one type of insight into the text that I have gained when working with the quarto text of Much Ado About Nothing. When running the script I encountered a problem. This problem concerns the hyphens in the text, insofar as words divided at the end of lines with a hyphen were counted as two separate words. To overcome this problem I tried to remove these hyphens from the end of the lines automatically, but then I ran into a further problem. Either the machine simply removed the hyphens but left the words divided, which was no good, as the two halves remained two separate strings. Or the hyphens were removed and the two halves of the words were united, which was no better, because then the two lines in which the two halves were located became united, too, and this distorted the number of lines. So finally I removed the hyphens and united the words manually so as to avoid the unification of lines. The manual unification of words was beneficial on a further account as well, as I could decide on an individual basis in which line the word was to be placed.
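For what it is worth, the dilemma can also be handled automatically, at the price of giving up the case-by-case decision. The following is only a sketch under my own assumptions, not the script I actually used: the hypothetical function `join_line_end_hyphens` always moves the second half of the word up to the first line and leaves the rest of the next line where it was, so the number of lines does not change. The two sample lines are invented, not quoted from the quarto.

```python
def join_line_end_hyphens(lines):
    """Unite words split by a line-end hyphen without merging the lines.

    The second half of the word is moved up to the end of the first
    line; whatever remains of the next line stays in its own line.
    """
    result = list(lines)
    for i in range(len(result) - 1):
        current = result[i].rstrip()
        if not current.endswith("-"):
            continue
        # Split the next line into its first word and the remainder.
        parts = result[i + 1].lstrip().split(None, 1)
        if not parts:
            continue  # next line is empty, nothing to join
        result[i] = current[:-1] + parts[0]
        result[i + 1] = parts[1] if len(parts) > 1 else ""
    return result


# Invented sample lines for illustration.
sample = ["he is a noble gentle-", "man of Padua"]
print(join_line_end_hyphens(sample))
```

Note the limitation: a genuine hyphenated compound that happens to break at the end of a line would lose its hyphen this way, which is one more argument for checking these places by hand.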

While working on this task, which took only about 15 minutes, I noticed that compound words divided with hyphens appear in mid-line position as well. So next I wrote a short script to collect all these hyphenated compounds, count the lines in which they occur, and count the total number of lines in the play. With these numbers in hand I also calculated the relative frequency of the lines in which compounds appear.
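The collecting step might look roughly like the sketch below. This is a reconstruction under assumptions, not the original script: the names `COMPOUND` and `collect_compounds` are hypothetical, tokens are split on whitespace so that trailing punctuation stays attached, and the sample lines are made up for illustration using spellings from the play.

```python
import re

# A token with letters on both sides of at least one hyphen,
# e.g. "turne-coate,". Apostrophes are allowed inside the word parts.
COMPOUND = re.compile(r"[A-Za-z']+(?:-[A-Za-z']+)+")


def collect_compounds(lines):
    """Return one list of hyphenated tokens per line that contains any."""
    per_line = []
    for line in lines:
        hits = [tok for tok in line.split() if COMPOUND.search(tok)]
        if hits:
            per_line.append(hits)
    return per_line


# Made-up sample lines, not quoted from the quarto.
sample_lines = [
    "he is a turne-coate, they say",
    "no compound in this line",
    "out-facing, fashion-monging boyes",
]

found = collect_compounds(sample_lines)
lines_with_compounds = len(found)
total_compounds = sum(len(hits) for hits in found)
relative_frequency = lines_with_compounds / len(sample_lines)

print(found)
print(lines_with_compounds, total_compounds, relative_frequency)
```

Keeping one inner list per line makes it easy to see both how many lines contain compounds and which lines contain more than one.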

The compound words divided with a hyphen, in order of appearance in the quarto edition of Much Ado About Nothing, are the following:

['turne-coate,'], ['Hare-finder,'], ['Ballad-makers'], ['warre-thoughts,'], ['ouer-heard'], ['March-chicke,'], ['start-vp'], ["heart-burn'd"], ['mid-way'], ['ouer-masterd'], ['day-light.'], ['Schoole-boy,'], ['ouer-ioyed'], ['tooth-picker'], ['sun-burnt,'], ['working-daies,'], ['loue-gods,'], ['kid-foxe'], ['night-rauen,'], ['out-rage'], ['ouer-heardst'], ['hony-suckles'], ['heare-say:'], ['wood-bine'], ['bow-string,'], ['hang-man'], ['tooth-ach.'], ['tooth-ach.'], ['Dutch-man'], ['French-man'], ['lute-string,'], ['tooth-ake,'], ['hobby-horses'], ['Ote-cake', 'Sea-cole,'], ['Sea-cole.'], ['Hot-blouds,'], ['worm-eaten'], ['cod-peece'], ['gentle-woman,'], ['night-gown'], ['Sea-cole,'], ['eie-liddes'], ['ouer-whelmd'], ['candle-wasters:'], ['tooth-ake'], ['milke-sops.'], ['out-facing,', 'fashion-monging'], ['trans-shape'], ['vnder-neath,'], ['gossep-like'], ['Lacke-beard,'], ['grey-hounds'], ['carpet-mongers,'], ['witte-crackers'].

It seems that out of the 2589 lines of the play, hyphenated compounds appear in 54 lines, and in two of these lines there are two such compounds, so altogether there are 56 hyphenated compound words in the text. The relative frequency of the lines in which there are hyphenated compounds is 0.0208574739282, i.e. about 2.1 per cent. Furthermore, as there are 22,171 words in the text, the relative frequency of hyphenated compound words in the text is 0.00252582201976, i.e. about 0.25 per cent.
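The arithmetic behind these two figures can be checked in a few lines. The counts are the ones reported above; the variable names are merely illustrative.

```python
# Counts reported in the post.
total_lines = 2589
lines_with_compounds = 54
lines_with_two_compounds = 2
total_words = 22171

# Each line contributes one compound, and two lines contribute one extra.
total_compounds = lines_with_compounds + lines_with_two_compounds

line_frequency = lines_with_compounds / total_lines
word_frequency = total_compounds / total_words

print(total_compounds)
print(line_frequency)
print(word_frequency)
```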


Now why are these numbers important? Their significance can only be gauged when they are compared with those of another text, or of other texts, because only then may a pattern emerge. But then what kind of texts should they be compared and contrasted with? Those of Shakespeare? Or those of the printer? If Shakespeare’s, should it be only the quarto editions, as these are close in time, or all the early prints, i.e. the First and Second Folios, as well as books of the same period? Or only those early printed editions that, like Much Ado About Nothing, go back to some form of manuscript, because these may reveal something about Shakespeare? Or only those published by Andrew Wise and William Aspley, the publishers of the quarto edition of the play, or those printed by Valentine Simmes, as it was his employees who in the final analysis created the printed text? Or perhaps these features have nothing to do with Shakespeare at all, but rather with the publishers, i.e. Wise and Aspley, or the printer, i.e. Simmes; in that case they should be compared only with books that one of these parties produced, not necessarily authored by Shakespeare, as these are the people responsible for the text we can witness today. In other words, is this statistical analysis related more to the history of the book, or the history of spelling, than to the study of Shakespeare? Answering these questions may be unavoidable when looking for texts to compare the quarto of Much Ado About Nothing to.