Entered León Province

I’m finally out of Palencia province and now in León province in the autonomous community of Castile and León. “In”, of course, is to be taken figuratively, not literally, as I just finished enough miles on the treadmill in the basement to reach Sahagún, the first town along the Camino on the east side of León province, 259.98 miles from the start. That’s not many miles for ten months, but I’ll also hit another milestone of 6000 miles on the stationary bicycle, which isn’t too shabby; in fact, it’s more than I used to do for real when I lived in California and real biking outside was feasible.

Now my “progress” on the virtual trek is boring content, so I’d hoped to bring you some new tidbits of information about menus but, alas, I couldn’t find any. Between Trip Advisor and Google I had nearly 20 candidate establishments that serve food, but none has an online menu. Thus I have no source material to examine. And while the pictures (from the few websites, or geo-located on Google) show plenty of food, there are no words. This is disappointing since none of the small towns all through Palencia have had menus to translate either. I didn’t even see a photo of the blackboards outside restaurants that frequently serve as menus. And the only photo I found of a menu was of the English menu and thus not interesting.

So rather than leave you with nothing about translations, I found a list of the 15 most interesting things to do in Sahagún (effectively all related to religion) and extracted a few words. Silly me for not knowing Iglesia, since it is rather common.

Arte Sacro: Sacred Art
Castillo: Castle
Iglesia: Church
Carcel: Jail
Monasterio: Monastery
Museo: Museum
Oficina: Office
Palacio: Palace
Puente: Bridge
Ruta: Route
Santuario: Sanctuary
Semana Santa: Holy Week
Turismo: Tourism

The trek across Castile is still rather boring, so I’m actually glad I’m only doing the virtual version.

Back to some menus

On my virtual trek I’ve recently passed through a number of small towns along the Camino route and am now just a few miles outside Sahagún, which is large enough that it perhaps has some online menus. The small mom-and-pop places I recently passed along the Camino don’t have websites or don’t have online menus, but their menus are simple. Often there are photos, sometimes even a photo of the menu, which is small enough to be written on a blackboard. The dishes are simple and thus don’t represent any new information I can extract for my corpus.

So I went looking around for more interesting menus in this part of Spain. Even the larger towns don’t have much (Castile seems pretty empty, a lot like western Nebraska). But I did find the town of Palencia (the main town of the province of Palencia) had some menus to study. Then I stumbled on another in the town of Saldaña, which is 36km northeast of the Camino route. There I found Twisted House Restaurant, aka La Casa Torcida. This menu is interesting in that it has a Menú del Día for each day of the week and for the weekend. I’ve mentioned Menú del Día before: it is a fixed-price menu, typically of two courses (PRIMEROS PLATOS, SEGUNDOS PLATOS), plus some extras like bread (pan), drink (vino de la casa or agua) and dessert (postre). But the number of items each day is fairly small and most of the items are similar between days, as I’ll discuss below.

In looking at this restaurant and the few I’ve examined in Palencia it becomes clear that a) knowing the regional cuisine matters and b) at least in this area, “local” matters and thus has to be part of the process of interpreting menus. In the USA the original “organic” foodie movement has largely been replaced by the “local” movement, i.e. restaurants and farmers’ markets focus on food obtained nearby. Since most people in towns have at most small backyard gardens there is even the CSA (Community Supported Agriculture) where farmers may convert small plots to allow citizens to grow their own food. This is fun for foodies in the US but when I encounter this same effect in Spain it gets confusing in terms of extracting vocabulary from menus. What does this mean? For instance, this restaurant has several variations of:

alubias “blancas de Saldaña” con boletus y foie: “Blancas de Saldaña” beans with boletus and foie gras
alubias “blancas d Saldaña” con almejas y langostinos: “Blancas d Saldaña” beans with clams and prawns
alubias “blancas de Saldaña” con bogavante.: “Blancas de Saldaña” beans with lobster.
alubias “blancas de Saldaña” con almejas: “Blancas de Saldaña” beans with clams

The key part of this is “blancas de Saldaña”, which simply means “local” food in the vicinity of Saldaña, applied here to alubias (beans), which are described in this blog post. Really they’re just white kidney beans that happen to be grown locally and are probably mostly indistinguishable from any other beans to anyone but an expert. But the restaurant likes to tell its customers they are getting the “local” product and probably residents believe their beans are somehow special. But to a trekker, is there any difference? So for these items on the daily menus it’s just a white bean stew/soup with various added proteins. In terms of translation what the traveler really needs to know is: boletus (a class of mushrooms), foie (presumably classic goose liver), almejas (clams), langostinos (prawns, probably not the Italian langoustine, aka Norway lobster), and bogavante (lobster, possibly classic Maine lobster but inland in Palencia probably some lesser Atlantic lobster).

Another instance from the menu is:

garbanzos “fuentesauco”

garbanzos “d fuentesauco” con boletus

Here again fuentesauco is just a local reference to either Fuentesaúco (a municipality located in the province of Zamora, Castile and León) or Fuentesaúco de Fuentidueña (a municipality located in the province of Segovia, Castile and León), each of which has a Wikipedia entry but quite possibly are the same thing. Either way it’s just local chickpeas probably, again, indistinguishable to anyone but an expert.

So, recognizing these “local” references in menus (and not thinking it is some other term not translatable to English) allows reading the menus. But some of the other items on this menu raise more interesting questions (as to what the dish really is).
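To make that idea concrete for my own notes, here is a minimal sketch of how code might set aside the quoted “local” qualifier before looking up the remaining words. Everything in it is my assumption for illustration: the tiny GLOSSARY only holds words from the items above, the function names are made up, and treating anything inside quotes as a place reference is a guess that happens to fit this one menu.

```python
import re

# Tiny illustrative glossary; the entries come from the menu items above.
GLOSSARY = {
    "alubias": "beans",
    "garbanzos": "chickpeas",
    "boletus": "boletus mushrooms",
    "foie": "foie gras",
    "almejas": "clams",
    "langostinos": "prawns",
    "bogavante": "lobster",
    "con": "with",
    "y": "and",
}

# Matches text inside straight or curly double quotes (including the
# mismatched closing quote some menus use).
QUOTED = re.compile(r'[“"]([^”“"]*)[”“"]')


def split_local_reference(item):
    """Separate quoted 'local' qualifiers from the other words of a menu item."""
    qualifiers = [q.strip() for q in QUOTED.findall(item)]
    remainder = QUOTED.sub(" ", item)
    words = [w.strip(".,").lower() for w in remainder.split()]
    return qualifiers, [w for w in words if w]


def gloss(item):
    """Word-by-word gloss of an item, keeping local qualifiers untranslated."""
    qualifiers, words = split_local_reference(item)
    translated = " ".join(GLOSSARY.get(w, w) for w in words)
    if qualifiers:
        return f"{translated} [local: {', '.join(qualifiers)}]"
    return translated


print(gloss("alubias “blancas de Saldaña” con almejas y langostinos"))
# prints: beans with clams and prawns [local: blancas de Saldaña]
```

A word-by-word lookup like this ignores Spanish word order, so it is only a first pass; the point is just that the quoted place name is kept out of the translation step.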

Now I’ll describe a dish that baffled Google Translate. But as I discovered (and this could be relevant when using a smartphone for translation) the structure of the HTML caused problems for Google in parsing (which words go together) and thus in translating:

lechazo “de Castilla” recien asado: letter from “Castilla” newly roasted chickened correct
lechazo “recien asado de Castilla”: “recien asado de castilla” recipes
lechazo “recien asado de Castilla”: recienated astillo de Castilla “
lechazo “recien asado” de Castilla: “recien roasted” milk of Castilla
lechazo “recien asado” de Castilla: “recien asado” cheese of Castile

You can see how much trouble Google had with this item, which eventually I realized was primarily due to parsing of the HTML (since Google then uses all the words found to translate as a group). None of these translations is even vaguely right: there is no chicken, milk or cheese involved, and recienated and astillo aren’t even words. And note that even though these are all the same item, Google translates each one differently. Perhaps the use of quotes messed up the Google parsing. But we can deconstruct this literally and figure it out.
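If the HTML structure really is the culprit, one workaround would be to pull each menu item out as a single clean string before handing it to any translator. The sketch below uses only Python’s standard html.parser. The sample markup is invented by me (I haven’t reproduced the restaurant’s actual page), and the assumption that each item sits in its own <li> is just that, an assumption.

```python
from html.parser import HTMLParser


class MenuItemExtractor(HTMLParser):
    """Collect the text of each <li> as one string, ignoring inline tags."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._parts = None  # None while we are outside an <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._parts = []

    def handle_endtag(self, tag):
        if tag == "li" and self._parts is not None:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join(" ".join(self._parts).split())
            if text:
                self.items.append(text)
            self._parts = None

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)


def extract_items(html):
    """Return a list with one string per <li> found in the HTML."""
    parser = MenuItemExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical markup: inline tags split one dish into several fragments.
sample = (
    "<ul>"
    "<li>lechazo <em>“de Castilla”</em> recien asado</li>"
    "<li>judias verdes<br>con jamón</li>"
    "</ul>"
)
for item in extract_items(sample):
    print(item)
```

Each dish then reaches the translator as one phrase, rather than as fragments split wherever an inline tag happened to fall.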

First, asado is easy: it is simply ‘roasted’. This is a common term one should know; some restaurants even have asador (a grill or roasting house, from the verb asar) in their name. BTW, now I do see one source of Google’s confusion: asadero is a type of cheese.  de Castilla is either another “local” reference (of Castile) or it indicates the style (which usually would be a la castellana or castellano). So it comes down to what is lechazo? We’ve seen this before and there is a connection to ‘milk’ (leche). Wikipedia considers this a dish (not an ingredient) but it really means an unweaned baby animal. When it’s a baby cow we’d call it veal, but lechazo normally refers to a suckling lamb (a suckling pig would more usually be lechón or cochinillo). That leaves recien as somewhat ambiguous. This could mean (literally) ‘recent’ but also might be a modifier ‘just’, as in ‘just roasted’. I can’t quite decide what I think this means (recently slaughtered, recently roasted, recently born?) but it really is just amplifying the idea of a young animal. Microsoft decided this meant “Lamb “de Castilla” freshly roasted”, which is close: it’s simply roasted young lamb.

Now for a few other mystery items where the Google Translation was not very helpful.

volovanes de hojaldre rellenos de crema de setas y gambas: volovanes de hojaldre filled with cream of mushrooms and prawns

Again I think the HTML parsing was an issue since Google didn’t even translate volovanes (vol-au-vent; a small hollow case of puff pastry) or hojaldre (literally: puff pastry, which is kinda redundant given the definition of vol-au-vent).

judias verdes con jamón: green jews with ham

I’ve already found that judias is another term for beans that somehow historically was connected with the Jewish population in Spain and thus the source of Google’s confusion – this is just green beans.

crepes caseros de pollo salteado con verduras de la huerta: chicken house creams salted with vegetable vegetables

‘house’ is possibly translated from caseros but this usually means (and is common on menus) “home-made”. salteado is not salted (although Google’s translation can be understood); it really should be sautéed, but it’s not clear whether this applies to the chicken (pollo) or the vegetables (verduras). And the ‘vegetable vegetables’ really means vegetables from the garden, perhaps another local reference, i.e. whatever vegetables are locally and seasonally available from some nearby garden (huerta). And crepes are crêpes, not ‘creams’.

ensalada de la huerta con crujiente de cecina: salad of the garden with crumbler of Cecina

Google didn’t translate cecina, which Microsoft translated as ‘jerky’.  But jerky is probably not quite right as this could be any type of dried meat. crujiente usually translates as ‘crunchy’ but could also mean ‘crumbs’ (or basically we might say bits); IOW there is just some crumbled-up dried meat added to the salad.

corral is used in several places in this menu (huevo de corral, pollo de corral) and has multiple translations with ‘yard’ or ‘barnyard’ as the most likely. IOW, in the US this would just be ‘range’ or ‘free-range’, yet another foodie reference that is common today.  Google once somehow translated this as ‘cork’. Google also failed to translate picatostes, which is probably best understood as ‘croutons’. Somehow Google decided puerros is ‘doors’ instead of the more likely ‘leeks’.

But one more that took some work:

medallones de rabo de buey deshuesado en su jugo: bird meadows of bone deposited in its juice

Google’s translation is, frankly, nonsense. So it took me some work. Fortunately I’d encountered (and remembered) rabo de buey from elsewhere, which really is most easily understood as ‘ox-tail’. deshuesado in this case is best understood as ‘de-boned’ or ‘boneless’, i.e. the bone has been removed. Then the remaining meat is sliced into medallions, which most foodies would know are just thin slices. Microsoft got a lot closer with “medallions of ox tail boneless in its juice” but it put ‘boneless’ in the wrong place in the phrase.

And, finally (there are more, but I’ll end with this) is:

Sanjacobos de lomo rellenos de jamon y queso: Sanjacobos de lomo filled with ham and cheese

where I eventually found:

A “San Jacobo” is a popular “merienda” (afternoon snack) or tapa in Spain. It consists of a slice of cheese between 2 slices of cooked ham, which is breaded and then fried.

Its name refers to Santiago-Jacobo-Yago, patron of the city of Basel , and the pilgrimage by Christians to his supposed grave in Santiago de Compostela (Galicia, Spain), Camino de Santiago.

The closest equivalent seems to be cordon bleu (although this dish uses lomo, almost certainly meaning pork loin, instead of the chicken in the classic French dish), otherwise perhaps known as schnitzel or a Monte Cristo sandwich. The Wikipedia article about Cordon bleu says:

A variant popular in the Asturias province of Spain is cachopo, a deep-fried cutlet of veal, beef or chicken wrapped around a filling of Serrano ham and cheese.[11] In Spain, the version made with chicken is often called san jacobo.

So this menu (or set of daily menus) presented some serious difficulties in machine translation. I’d assume most/some of these issues would appear if you were using a smartphone trying to decipher this menu. Of course if you were fluent in Spanish you could just ask for a description of these items. But all this shows the challenge I have in building a more effective menu translation tool. Not only do I need a large vocabulary and some smart parsing of menu items but I need to know all this “local” terminology (geographic references) and/or just some fairly extensive background of the cuisine of the various regions of Spain.

How I’ll put this menu (and my translations) into my corpus poses some interesting technical questions. And then, to avoid the common Google mistakes, which they think of as “context”, I have to deal with variations in the word order and/or extra words, which pose some real challenges to effective translation. Google says its translation is not dependent on the syntax and grammar of a language but I believe my app has to be fairly smart about all this.

And finally this menu is good evidence of how much work I have to do. Almost every menu I try to decipher has some interesting peculiar bits to get an effective translation. This takes a lot of research and guessing and double-checking for my intelligence to get a reasonable answer – how do I code my thought processes in an app?

We’ll see, but perhaps, Dear Reader, you can see what an interesting challenge this is.

Virtual trek landscape observations

This isn’t a Spain food post but rather my observations, via remote means, about my virtual trek along the Camino. I’ve mentioned that, as an incentive to exercise, I convert miles I do on the treadmill in the basement to a position along the GPS track I have of the Camino de Santiago. I then use Google’s Streetview and satellite images to try to “see” what the trek actually looks like where I’m “at”.
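For the curious, the miles-to-position conversion can be sketched roughly as follows. This is not my actual tooling, just a minimal illustration under some assumptions: the track is a plain list of (latitude, longitude) points in walking order, distances between points are great-circle (haversine) distances on a sphere of radius 3958.8 miles, and elevation is ignored. The function names and the three sample points are made up.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def position_at(track, miles):
    """Return the (lat, lon) reached after walking `miles` along `track`.

    Walks the track leg by leg and linearly interpolates inside the leg
    where the distance runs out. Past the end, the last point is returned.
    """
    if not track:
        raise ValueError("track must contain at least one point")
    if miles <= 0:
        return track[0]
    walked = 0.0
    for start, end in zip(track, track[1:]):
        leg = haversine_miles(start, end)
        if leg > 0 and walked + leg >= miles:
            t = (miles - walked) / leg  # fraction of this leg covered
            return (start[0] + t * (end[0] - start[0]),
                    start[1] + t * (end[1] - start[1]))
        walked += leg
    return track[-1]


# Made-up three-point track, only to show the call.
demo_track = [(42.30, -4.90), (42.33, -4.97), (42.37, -5.03)]
print(position_at(demo_track, 2.5))
```

Linear interpolation between neighbouring track points is fine here because the points on a recorded GPS track are close together; the resulting coordinates can then be pasted into a map to look around.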

What triggered this post was my searching for restaurants along the trek that have online menus. I’ve reached Ledigos in Palencia province of Castilla y León  autonomous community. I’ve been trekking across Palencia for over a month now (an actual trek would be a few weeks). Ledigos has a few places to eat but nothing online I can analyze.  That’s been true of all the various towns along this stretch of the Camino.

In my previous post I talked about extracting Spanish food terms from various online lists. This is useful for expanding the corpus I’m building to then have code extract an extensive vocabulary that would be useful in interpreting menus in Spain and deciding what to (or not to) order. Lists are helpful but not very interesting to process. The Spanish terms with English definitions/translations are just mindless mechanical work (automating processing these lists is very difficult). I don’t mind tedium of this kind of processing but I don’t learn much. The dictionaries or glossaries that are entirely in Spanish are a bit more interesting since I use machine translation of the Spanish to English and often these translations require additional investigation to find out what the terms really mean. That’s a bit more interesting and helps me learn a bit (not just mindlessly accumulate raw data).

But restaurant menus are far more interesting. They often use terms that defy machine translation and thus require a lot of investigation. Thus I learn a lot from these. So being a bit bored with processing lists and not finding any online menus along this stretch of the Camino I used Google maps to find larger towns that are more likely to have menus to see. In the province of Palencia there are not many of these towns; in fact, only the city of Palencia is large enough to provide some online material. And that’s what I’m working on in my food terms part of my adventure and will have some posts on those menus.

So searching the map of Palencia revealed even more of the landscape than I’ve “seen” along my virtual trek. And, frankly, what I see is hot, dry, dusty and boring countryside with sleepy little nondescript towns. A real trek on this stretch would not be very interesting.

Here in Nebraska I have access to three trekking trails that are rural and would require more than a day to walk. First is the nearby MoPac, a rails-to-trails conversion that starts about 20 miles west of Lincoln and goes into the city. For those of you who aren’t familiar with this concept, the right-of-way that was granted to build railroads often reverts to the state when the railroad is abandoned. While the railroad was operating the route was transformed to be level with gentle curves, either filling in depressions or cutting grades through hills. Here there are numerous small streams so the railroad required bridges and those can be refurbished to provide the walking path. So the MoPac (and others) make for easy and sometimes pleasant walking. Since many of these railroad routes were built when train engines still burned coal (or even wood) there is usually additional area along the side of the tracks so embers didn’t ignite crops or houses. So today the MoPac is overgrown with “wild” brush and trees, often 50m or so on both sides of the trail. As a result hiking is often in the shade, something definitely not the case in Palencia. But there are non-shady stretches along the MoPac that can be seriously hot in summer with intense sun.

A second trail is the Wabash, another rails-to-trails in Iowa. It starts on the south edge of Council Bluffs and continues all the way to Missouri, nearly 70 miles, so more than a day hike. This trail is even more overgrown and shaded than the MoPac. So despite being surrounded by farms, usually within 50m of the trail, it feels more like wilderness. The Wabash goes through a number of towns and a few of those now have refreshment (but not overnight lodging) for trekkers. Doing the entire length of the Wabash would take multiple days (possibly doable in one day on a bike, although biking speed on the unpaved trail is much lower than paved road biking, so doing the entire Wabash is harder than doing a Century ride). Thus a trekker would need vehicle support at the end of each day to find lodging. This contrasts with the Camino which has lodging, water and food at convenient daily hiking intervals, undoubtedly one of the main appeals of the Camino, all the infrastructure to support peregrinos.

In segments (and recording with my GPSr) I’ve done the full length of both of these trails and very much appreciate that the states chose to use the abandoned right-of-way for recreation. But in some ways I view these trails as practice (or an appetizer) for a real long-distance trek.

So now I’ll tie this together with Palencia. A third long-distance trail is the Cowboy Trail. Nominally (except it’s unfinished) it could be the longest trekking trail in the US. It’s a bit longer drive for me to reach it (I can get to MoPac or Wabash in an hour) so I don’t normally consider hiking any of it. But the Cowboy Trail is right next to a highway I’ve driven multiple times. So it turns out the Cowboy Trail is very similar to the Camino, at least the long stretch in Castilla y León and especially Palencia province. It has little shade and so also is hot and dry and flat and passes through either monotonous fields of corn or soybeans and further west (even drier) through pasture land. It looks a LOT like the Palencia stretch of the Camino.

In my other hobby I do an online recreation called Geodashing.  This involves trying to reach completely random “dashpoints”, just a latitude and longitude. Geodashing requires getting within 100m of the coordinates without trespassing on private land so I take each month’s new set of dashpoints (about 30,000 each month, worldwide) and analyze if I can reach them (i.e. drive close, often on very remote and sometimes poor roads). As a consequence, having done this for over 10 years, I’m pretty good at analyzing countryside by satellite photos and sometimes Google Streetviews. Looking down from space on features on the ground takes some practice to imagine what there is at ground level. So I’ve had lots of practice with this, and having done the same thus far for the Camino route I think I have a good idea of what is around it. For the Camino there are actually far more Streetview paths (the Google cars seem to have gone on all the little roads in Spain, more so than here in the Great Plains). I’m sure it would be different to experience it for real but I think I have a good idea about the landscape.

I first learned of the Camino de Santiago from the movie The Way. Later I learned there are actually many branches of the Camino and so more correctly the part of the Camino I’m following is The French Way, aka Camino Francés. While this is a very ancient pilgrimage route it was closed during the Moorish occupation of Spain but now is the most popular route. The movie makes the Camino far more visually appealing since it mostly shows scenes in Navarra and Galicia; both of these are wetter (thus greener and often wooded, more like wilderness backpacking trails) and have significant topography (i.e. the Pyrenees). But the bulk of the French Way is actually in flat and boring farm country. While the crops in Spain are different than along the Cowboy Trail in Nebraska, farmland is farmland and not the appealing scenery of other parts of the Camino.

But wrapping this long post up I want to comment on another interesting feature. That is the relative lack of human habitation outside the small towns.

Several decades ago I did an organized bicycle trip through southern Germany and Austria. Many people on that ride were from the midwest US so we discussed differences between rural areas in Germany vs the US. A very pronounced and obvious difference was the lack of farmhouses out among the fields. It seemed, since on a bike we notice hills, that all the farmers lived in small towns on hills and only crops are present in the bottomlands. At first we speculated this was a historical defensive choice as farming is many centuries older than in the US and Europe had a whole flock of wars. But we later learned the more obvious answer was that hills are drained and dry and so not very good for crops so houses were built there leaving the better-watered areas just for crops.

So that is another very noticeable difference between the three trails I described here in Nebraska and the Camino. When you can see beyond tree cover there are farmhouses everywhere on the Nebraska trails. And from the satellite and Streetview images there are almost none in Spain, just like Germany. The other really noticeable difference (which is correlated with lack of farmhouses) is that fields here are much larger and usually quite regular. This is a consequence of the land policies in the US where the government acquired vast tracts of “empty” land (as the sarcasm goes, “stolen fair and square” from the original peoples) and made these easily available to homesteaders. Thus, at least west of Ohio, most of the farm country in the midwest US has a grid of roads (many now abandoned but still visible in satviews) on one mile spacing, aka “section lines”. A section in the US is 640 acres or one square mile. In fact, a completely different geographical reference system, known as township/range and section, is used rather than longitude and latitude.

The Homestead Act allowed people to acquire a quarter section (160 acres) often free or at least very cheaply. So often each square mile had four farmhouses, many now abandoned. Unlike the irregular patchwork quilt of fields I see in Palencia, fields here are almost entirely regular (not true in the older parts of the US, i.e. the eastern states).

In the US those original homesteads have mostly been consolidated into larger blocks of land. With automation it’s entirely feasible (and economically necessary) to farm at least an entire section if not several sections. At the time when this land was originally opened (19th century) such large farms were not feasible.

I happen to know all this as I am in the process of obtaining title to a “small” farm in Oklahoma where my mother’s family lived. That farm is a mere 80 acres, 1/2 of the original tract of the typical quarter section of land grants. Before WWII it was feasible for a family to live on such a small farm, raising some crops for income and others for personal consumption. The titles to the land I will inherit are a mess, stretching back to the early 20th century. But one feature of land ownership, now reversed with “corporate” farming, was original tracts get divided through inheritance. So my little 80 acre farm will have three owners (once all the legal process is completed). My father’s family farm was divided among 14 owners. So in the US there are these two competing trends, dividing larger tracts into smaller ones and then (usually through sale by heirs who don’t want the farm or small farmers who can’t economically farm such a small tract) into much larger tracts.

Now looking at the aerial view of Palencia it’s clear the process of subdividing land has been going on a very long time and thus creates the patchwork quilt of small tracts. When I toured Portugal two decades ago, especially in the area south of Lisbon the “modern” trend typical here in the US was occurring. Small farms were not economically viable once Portugal joined the EU so small tracts were being consolidated into larger ones. Much of the farm country south of Lisbon looks a lot like the midwest US. In fact, I was surprised to see the center pivot irrigation systems sprouting up with equipment that was produced in Nebraska (the origin of the invention of center pivot irrigation, now home to most of the producers of that system). Palencia seems to have escaped this consolidation process but I suspect some of the competitive economic pressures of the EU will lead to more consolidation in Spain as well.

So the lack of farmhouses actually out on the land is, I speculate, primarily economic (not defense). Land is simply too valuable to waste by building even just farmhouses on it. So the farmers live in the small villages in a more urban land use pattern. Since the farms are still small in Palencia there are many villages, as there were in Germany as I found on my bike ride there.

Having so many villages, often just a few kilometers apart, was very handy for our ride in Germany. Most of the towns had at least a gasthaus and often a market and/or a bakery. This made obtaining food and water easy. If a town didn’t have what we needed the next town was 20 minutes away. But that’s on a bike. Walking, the towns are a couple of hours apart and of course that’s what I’m seeing on the Camino. Most of the small towns on that route have one or more albergues. It wouldn’t surprise me if on peak days peregrinos outnumber the local citizens. When I was looking off the Camino in Palencia the amount of lodging and restaurants in the villages declined, for the obvious reason they wouldn’t have many customers.

Now this is where I can make another comparison observation. Through geodashing I’ve been through a large number of small villages, mostly here in the Great Plains. And these villages are wasting away. If they were big enough to have some shops the now nearby Walmart (outside the taxing authority of the town) has driven those out of business (now Amazon is helping finish the job as seeing delivery trucks in the middle of nowhere is now much more common than when I first started geodashing). So these towns are dying. As a result they have few resources, either food or lodging for travelers. So along the three trails I described here long-distance self-supported trekking is basically impossible.

So what does all this mean? For me it has reduced my interest in doing the Camino. Too much of the route is this boring and hot/dry countryside with boring little towns (from Streetview also with many abandoned buildings like towns here in the midwest). Simply not very interesting.

Now the Camino was really not for recreational tourism. Its origin was the religious notion of pilgrimage. Whether the route was interesting or the towns were interesting is mostly irrelevant from the classical pilgrimage POV. But all those little towns with resources for trekkers have meant most modern pilgrims are largely doing recreational tourism. And that would have been my focus. In my younger days I did quite a bit of backpacking on the Appalachian Trail or the Pacific Crest Trail. Well, my days of sleeping on the ground and eating freeze-dried food are over; my old bones want a bed and hot food. So a long hike on wilderness trails in the US was not on my agenda. So seeing the movie, The Way, I thought the Camino looked like a great alternative. Also with lodging and food the backpack can be a lot lighter than in my wilderness backpacking. So it looked attractive.

But now, after my virtual trek, it looks less interesting. Spain is still appealing but I suspect I’d do my conventional tourism (with a car and mostly cities) if I get the chance to go. So I might see bits of the Camino (as I have the Cowboy Trail here) but I doubt I’ll walk it. This is disappointing to me to see the reality of the Camino as rather different than the romanticized view of the movie. The Camino can still be great, certainly as religious or spiritual pilgrimage or as a way to meet a lot of people with the many hours of trekking as an opportunity for conversation. But I was looking at it more like doing the Appalachian Trail but with beds and restaurants. And I think that’s what I’ve learned – too much of the trail would be just hot and dusty and tiring. Many people find the Appalachian Trail fairly boring, often called the “green tunnel”. While there is some spectacular scenery much of the AT is just walking through dense trees with no sights visible. The Pacific Crest Trail, OTOH, is much higher and much of it (at least the California stretch where I’ve backpacked) is dry and so there are some grand vistas.

So it’s all a question of what goals one has for a walk. I liked the Pacific Crest Trail but am now too old to do that. And now it looks like the Camino is out too.

So where do I look now?

Back to work – lists

As I don’t have any more travel planned I can get back to work, perhaps with a renewed effort. So I returned to looking at lists, at least three I’ve found and with more to go. Lists come in several forms: plain translations of terms between English and Spanish, glossaries, and dictionaries, where dictionaries supply an actual definition and glossaries sometimes just provide a translation (where a literal one is possible) or a definition otherwise. The Net is full of these but using them can be a challenge. Also I’ve usually looked only at lists where the terms are Spanish but the translation or definition is in English. It’s more interesting, although more work, to get the lists entirely in Spanish, and ideally ones that apply to Spain rather than anywhere Spanish is used.

So in my first attempt to build up a translation dictionary I only used lists I could find. It never dawned on me to use sources purely in Spanish, and in particular menus, but of course machine translation has advanced a lot since my V1.0 attempt years ago, so now sources entirely in Spanish and especially as applied to Spain are my primary sources.

But lists provide a lot of information in a hurry. And despite the issues they often provide terms that are unlikely to be found elsewhere. But the biggest issue is that whole thing of Spanish throughout the world versus Spanish gastronomy terms for Spain. As I’ve mentioned, tortilla is common in the western hemisphere but is something entirely different from what you’d get in Spain, even if the menu does say tortilla patatas. Now where lists include New World terms not used in Spain it’s just a waste of time, at least for my purpose, to process them. But when the meanings conflict between Spain and elsewhere, that is a problem.

So I’ve been crunching through three lists. Finding more lists is a lot easier (at least until I’ve found most of them) than processing the lists, especially when the lists are entirely in Spanish. Plus some types of webpages are hard to “mine” (also known as scraping when code is doing it). Web authors design pages to be most useful for their intended audience and not for someone accumulating a corpus. And even when I’ve processed lists I have to be careful with the whole copyright issue. If I published any substantial portion of any list I find (except in the fair use case, i.e. a small sample with attribution), that would be improper. But since my real notion is accumulating a large corpus from many sources and then basing my final translation vocabulary on a meta-analysis of many sources, I think I should be OK. Also, whenever I have a term translation from only a single source I need to be suspicious of its accuracy as well.

So thus far I’ve looked at: 1) the Gallina Blanca Diccionario, from the website of a food company in Spain that produces packaged products for the Spanish market and supplies the diccionario to aid users of the recetas it also provides; this has Spanish terms and definitions in Spanish but does not apply, at least not exclusively, to Spain; 2) Nitty Grits, a glossary with Spanish terms and English definitions, not exclusive to Spain, but as I learned after crunching through most of it, each term is clickable and often (not always) then indicates where the term is used; Nitty Grits is a large list and allows me to get fairly unambiguous definitions (since they’re in English) and avoid the often incorrect machine translations (such as occurred with Gallina Blanca); and, 3) ARecetas, a complex recipe website I’ve now returned to after doing some work on it back in May, which has multiple glossaries, especially the largest and most directly useful, Glosario de Alimentos. And there are more I’ve found but haven’t yet crunched through at all. Of these the ARecetas glosario is the hardest to process, so I only briefly looked at it in May and instead focused on Nitty Grits. But for several months Nitty Grits was not operational (at first I thought they might have blocked me but that was not the case).

Anyway now I have more issues, having finished two of these sources and resumed work on the third. First, the way I’ve extracted information (often a tedious process) is inconsistent between the three lists (meaning the tables I created manually in MSWord). Second, my notation system was inconsistent, i.e. I annotated much of what I found with no particular notation to distinguish original source text from my annotation. These issues mean I can’t possibly consolidate the three lists manually. So I had started some code to create a consistent format across all lists (in XML, which is more robust than just text in MSWord with a few fonts and colors). I was able to do Nitty Grits fairly easily but ARecetas and GallinaBlanca are tougher, i.e. it’s not just code I need; I have to go back to the manually compiled lists and use consistent inline markup so the code can parse all entries into the common XML I want for all three lists.
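To make the idea of consistent inline markup concrete, here is a rough sketch in Python of how one marked-up line might be turned into a common XML entry. Everything specific in it is an assumption for illustration only: the markup convention (term, then “=”, then the source text, with my own annotations wrapped in double curly braces), the element names, and the source name “ExampleList” are made up and are not the format I actually used.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical inline markup (an assumption, not the real format):
#   term = original source text {{my own annotation}}
ENTRY = re.compile(r"^(?P<term>[^=]+?)\s*=\s*(?P<body>.*)$")
NOTE = re.compile(r"\{\{(.*?)\}\}")


def entry_to_xml(line, source):
    """Convert one marked-up line into an <entry> element."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"unparseable entry: {line!r}")
    body = match.group("body")
    # Annotations are everything inside {{...}}; the rest is source text.
    notes = NOTE.findall(body)
    original = " ".join(NOTE.sub("", body).split())

    entry = ET.Element("entry", source=source)
    ET.SubElement(entry, "term").text = match.group("term").strip()
    ET.SubElement(entry, "original").text = original
    for note in notes:
        ET.SubElement(entry, "annotation").text = note.strip()
    return entry


def build_list(lines, source):
    """Wrap all non-blank lines from one source in a <list> element."""
    root = ET.Element("list", source=source)
    for line in lines:
        if line.strip():
            root.append(entry_to_xml(line, source))
    return root


if __name__ == "__main__":
    sample = [
        "ajo = garlic {{common on menus}}",
        "tortilla = omelette {{meaning differs outside Spain}}",
    ]
    print(ET.tostring(build_list(sample, "ExampleList"), encoding="unicode"))
```

The point of keeping the original text and the annotation in separate elements is that later code can always tell what came from the source and what I added.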

Now I need to finish ARecetas (and perhaps also some other smaller sites I found and also do a thorough job of searching) before moving on to the real world. Once I can convert each list, with my annotations and markup, to a consistent XML structure then I can attempt a “merge”. Once that is done I can then look for agreement or disagreement between the sources (as I processed them) and start fixing errors or doing more searching to get more accurate answers (although without wasting much time on non Spain terms).
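As a rough sketch of what that “merge” and the agreement check might look like, assuming each list has already been reduced to simple term and translation pairs (the source names and sample entries below are invented for illustration, and comparing translations as exact lowercase strings is a crude stand-in for real comparison):

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def merge(sources):
    """sources: {source_name: {term: translation}}
    returns:  {term: {translation: [source names giving that translation]}}
    """
    merged = defaultdict(lambda: defaultdict(list))
    for name, entries in sources.items():
        for term, translation in entries.items():
            merged[normalize(term)][normalize(translation)].append(name)
    return merged


def classify(merged):
    """Sort terms into agreement, conflict, and single-source buckets."""
    report = {"agree": [], "conflict": [], "single_source": []}
    for term, translations in sorted(merged.items()):
        n_sources = sum(len(names) for names in translations.values())
        if len(translations) > 1:
            report["conflict"].append(term)       # sources disagree
        elif n_sources == 1:
            report["single_source"].append(term)  # treat with suspicion
        else:
            report["agree"].append(term)
    return report


if __name__ == "__main__":
    # Invented sample data; real lists would come from the common XML.
    sources = {
        "list_a": {"ajo": "garlic", "tortilla": "omelette"},
        "list_b": {"Ajo": "Garlic", "tortilla": "flatbread"},
        "list_c": {"queso": "cheese"},
    }
    print(classify(merge(sources)))
```

The “conflict” bucket is where the Spain versus elsewhere problem would show up, and the “single_source” bucket is the set of terms whose accuracy I’d need to double check.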

People who compile lists usually have some other work. They usually want to get their list with minimal effort to achieve their purpose. Simply put, this means they make mistakes, sometimes even blatantly obvious to simple analysis, sometimes more subtle. I’m well familiar with this from my career, a concept of “good enough”. No compilation of information is ever perfect anyway so it’s more a question of how good does it need to be for the intended purpose versus how much work (usually measured as cost since some paid person is doing the work). So online lists have many flaws. And it’s not just online lists. I’ve bought a few books about food in Spain back in my V1.0 effort and these books have inconsistencies and errors (where error means they disagree with other sources). I’ve looked and I’ve never found a “best” or even highly accurate and comprehensive source.

And that’s part of why I’m even doing this project. Unlike the other people creating materials, either free on the Net or in for-sale published works I don’t have a cost issue with my work. As I’m retired and unlikely to ever even be a temporary consultant the marginal value of my time, measured in money, is zero. Therefore I can spend an infinite amount of it trying to be as accurate and comprehensive as I can be, even (and that would be fun) doing original field research, i.e. actually going to lots of restaurants in Spain with some consultant I could hire who’d be fluent in Spanish and cooking (then the bills do add up). So at least my “free” effort is just a question of how much work I wish to put in it.

So I do believe, despite my lack of fluency in the Spanish language, it is feasible that I could compile the best list, meaning the most comprehensive and accurate. Of course my list would have mistakes too but I think it could be better than any I’ve seen. AND, if I write good code that does the bulk of the work of consolidating the raw materials for my corpus and then extracting from it, I should have an easier time making corrections, especially as my targeted application is either machine-generated webpages or a smartphone app, i.e. updates should be possible once I actually get feedback (too many sites or apps fail to take advantage of the knowledge of their users to provide very valuable feedback to constantly improve the product, either its usability or its underlying database of Spain culinary terminology).

So I hope to get back into it and finishing these three lists would be a critical milestone because then I can really get down to designing my corpus and the code for importing and consolidating and proofing the information in the corpus.

Still moving, sadly

I used some of this title line back in a post on June 27 where I described that slowdown in both work on the vocabulary and my physical exertions.  However this month, August, was even worse both in terms of my progress, posting and personal life.

I’m not mugging for sympathy but last Saturday I was attending services for my sister, my only sibling, who died from pancreatic cancer on August 22nd.  There is a connection between her and my work on Spanish culinary terms which I’ll explain. Fortunately I was able to visit her while she was still lucid for her last (77th) birthday. Her death the day after my birthday was a shock. It was all very sudden. She loved her life and was looking forward to more of it – it’s not fair.

My sister loved travel. She had recently covered two places in the world she’d missed, Russia and India. She really only had one place left on her bucket list, Peru. Unlike me she’d been in Spain multiple times. IIRC her first visit was as a chaperone, cheerleader and tutor for her college’s football team, on one of those goodwill type visits of US sports teams to play exhibition games with European teams.

The reason my sister applies to this blog is that I had discussed with her my interest in learning to read restaurant menus in Spain, under the assumption that some day I would be doing it in person. She thought my idea was wrong; instead she advocated learning conversational Spanish so I could query the servers about the food. Her idea was one she had lived. She was more interested in food in Mexico but also learning Spanish. She made three trips, either to a Spanish-only cooking school or to live with families as process of learning “native” Spanish. She worked very hard at it (it was her fifth language). She even found Spanish speakers in her hometown in Ohio and shared cross-teaching where she’d help them with English and they’d help her with Spanish. In addition to Mexico she traveled to Puerto Rico primarily for restaurant visits but also she’d been to Guatemala (again with food but primarily art interest) and was learning Venezuelan cuisine from her friends in Ohio.

So she did what she advocated for me. But I never told her an anecdote about her Spanish. I have a sister-in-law who works on educational materials for ESL (English as a Second Language). She knows some Spanish but isn’t fluent. But her work brings her in contact with many native speakers. And she has an ear for languages. She had gone on a group trip to Oaxaca that my sister had organized and so observed her speaking to the locals. She joked that my sister’s accent was so bad she was barely understandable to native speakers. My sister had a PhD in English Literature and was very academically inclined, so naturally she had mastered grammar and vocabulary. But when it came to speaking she wasn’t great.

Now this anecdote also had a bearing on me. Just before I left California 20 years ago I had enrolled in Spanish classes. I’d gotten some books, a couple of DVDs and had tried to learn Spanish on my own. It was a complete failure. Having some conversational ability in Spanish, in California and some other parts of the USA is a survival skill. Most of the people I might hire as skilled labor for projects around my home had minimal English knowledge and so my being somewhat fluent in Spanish would have been helpful to discuss the work. Also on various bike rides I ended up in towns where the only language spoken was Spanish. But alas I have no talent for languages. I did learn some French and some German and could barely get by (using French in Quebec and German in Germany and Austria). But I just could never either hear or speak the sounds needed for Spanish. I sometimes watch Spanish TV programs here and the words go by so fast I catch almost nothing. It’s a hard language for me. OTOH, the written part is not so bad. Frankly it’s easier than either French or German and so I can do the “book learning” part of Spanish, much like my sister. But I suspect my accent would be far worse than hers and no one would understand me. So I’ll persist in my project focused on written materials, i.e. restaurant menus.

So that’s enough on that subject. Why then am I labeling this post as being about my progress on my virtual trek? Despite spending most of August either driving or at my sister’s house I did manage to get some miles in. As I mentioned in my previous post I was doing around 25 miles/month in the early months of the year but had fallen to about 14 miles/month. So in August I did manage 17.55 miles. Of course this is nothing compared to the requirements of pilgrimage on the Camino, my monthly amounts being literally around the requirement PER DAY! to complete the Camino in the standard time. But in my defense I also did 355 miles on the stationary bike, even in the few days I was at my home with my exercise equipment. And August was the end of my most recent year of records and so I’ve done 5698 miles of biking, which isn’t too bad. In California, where I regularly did recreational rides as well as commuting to work on a bike, I only did about 4000 miles a year and that was when I was 25 years younger than now.

In fact, even though walking seems the best way to do the Camino I am reconsidering whether I might try it on a bike instead. I once did a two week escorted trip in Germany where we averaged about 50 miles a day. But there the overnight stops were arranged and luggage was carried in the sag wagon. For the Camino it’s a bit tougher, both to find lodging (and even food) and to carry cargo. I have done one bike camping trip along the California coast, again in the 50 mile/day range. Having all my camping gear on the bike made the riding more challenging (even requiring walking two very steep hills despite my low gears that had always worked, even in the Sierra Nevada climbs). It was quite a bit harder than an escorted trip. Of course all sorts of escorted trips exist for the Camino so maybe I’ll be realistic and do one of those.

So on the last day of August I did reach another destination along the Camino at 245.63 miles, the small town of Calzadilla de la Cueza. There are a couple of albergues (and some private rooms) and perhaps one restaurant (no online menu to translate). But what was a bit interesting was that the previous stop with lodging and food was 10 miles earlier. On a bike that’s no big deal but on foot, if one had arrived at Carrión de los Condes and discovered there were no spaces, continuing on to Calzadilla de la Cueza would be difficult. That is one of the challenges I see with my doing the Camino solo. I’m not quite as flexible as I was 50 years ago and so being relatively certain I have a reasonable place for overnight is pretty critical. So this gap I covered in August is the longest I’ve encountered thus far.

And this is one of the appeals of the Camino. It’s so popular there is a lot of infrastructure to support trekkers. In contrast the California coast (or even worse the trip across America) can have vast distances with few facilities; even water can be a challenge. OTOH, biking in Germany was fine as the distances between towns were an easy ride and there were many towns. BUT local customs matter: Germany had weekday ruhetags (rest days) and if you arrive in a town on that day, forget staying in the local gasthof. So a solo traveler can find themselves in a bit of a jam.

So again I may have to consider an escorted trip as my days of solo travel may be past.

So I hope to do a bit better this month on both my treadmill and biking distances. The visits to my sister interrupted my schedule but at least the last one where she was alive I will cherish and be happy I wasn’t on the Camino and unable to see her one last time. I will be glad to be home and exercising rather than that very sad traveling.