Poor name choice of this blog

When I started this blog I just looked up a few words to pick what I thought would be an appropriate name. Did I mention (of course, just kidding) that I don’t speak any Spanish? But after months of working on my project I’ve learned a few things while analyzing thousands of machine translations by either Google or Microsoft.

Here’s an interesting “mistake” that led to some study and then a part of my point in this post:

brillante

Vino perfectamente límpido y transparente. Es un factor que tiene que ver con la juventud del vino. Atravesado por la luz parece brillar.

bright

It came perfectly limpid and transparent. It is a factor that has to do with the youth of wine. Pierced by the light seems to shine.

Translating vino as ‘it came’ is strange (but occurs frequently in the wine vocabulary I’m compiling now).  Recalling the familiar saying of Julius Caesar (veni vidi vici or I came, I saw, I conquered) is the clue. The Spanish for ‘to come’ is venir and the conjugation for third person singular past tense is, ta-da, vino! So, in fact, ‘it came’ is a completely reasonable (but wrong) translation. Obviously given I’m looking at lots of definitions of wine terms, vino in this context is, of course, wine. So much for context sensitivity in machine translations.

But the point, which I learned from some brief reading about Spanish syntax, is that the verb ending usually identifies the subject, so subject pronouns can be deduced unambiguously and thus are usually omitted.

So, mistake #1: Yo is unneeded and I should have used only traduzco (the first person singular present tense of traducir, to translate).

Mistake #2 is a bit less obvious. Yes, comida can mean ‘food’, which was my intent. But in Spain it can also mean ‘lunch’, which definitely is not my intent. And it can also mean ‘meal’ (the act of eating, not the food itself), which is a bit better. In fact the authoritative dictionary of Spanish, the Diccionario de la Lengua Española from the Real Academia Española, has five meanings for comida.

Now really my project is to construct a robust translation tool for menus in Spain. The Spanish menú is too close to English and thus wasn’t Spanish-y enough for me and so I picked comida instead (plus I wasn’t quite sure how to get the ú in the blog name at WordPress).

So traduzcomenú would be a better name, but now it’s too late to change. It’s probably fair that the name I chose doesn’t make sense and thus is a clue to a true Spanish speaker of how clueless I am. Not a good start if I’m claiming I’m going to build a really good translation tool.

Oh well, live and learn.

Quesos de España – A Great Source

I took a break from decoding menus from restaurants in Spain to look at cheeses that originate in Spain. I’ve done this type of investigation before (previously for Italy) and it’s a challenging task. Names of cheeses can be very inconsistent from different sources. Even with DOP names now more common there can still be inconsistencies.

And, of course, using any online source for raw material has the challenge that its author may be wrong, may have misspelled names, or may have introduced other errors. Consolidating all the names found in different sources is difficult to automate, yet it is also a large quantity of information to collate mentally, especially when one is not conversant in the language.

I’ll explain my process below, but in case you just want the excellent source I found, I’ll describe it first, even though I discovered it only after a lot of searching.


While it’s entirely in Spanish and, as a PDF, can’t be run through Google Translate when accessed in the web browser, this is a very nice document: CATÁLOGO ELECTRÓNICO DE QUESOS DE ESPAÑA (slow to download but worth the wait).

It has pictures of the cheeses and even some of the animals for the milk plus standardized descriptions including items like: Zona de Elaboración (processing area), Ingredientes (ingredients), Tipo de Queso (cheese type), Aspecto Exterior (outward appearance) and Aspecto Interior (interior appearance).

And then even more helpful is this section, Características Organolépticas (organoleptic characteristics; I had to look up the English definition, which is “acting on or involving the use of the sense organs”), which includes: Textura al Tacto (texture to touch), Olor (odor), Textura en Boca (texture in mouth), Aroma (aroma), Sabor (flavor), Otras Sensaciones (other sensations), Gusto Residual (residual taste), Persistencia (persistence). In case you’re not sure what Gusto Residual means, here it is for Gamonedo cheese (from Principado de Asturias):

El gusto después de ser tragado es: a avellana, con predominio suave de humo (The taste after being swallowed is: a hazelnut, with soft predominance of smoke.)

And here is an example of Persistencia for Curado (cured/aged) Mahón-Menorca cheese:

Media-elevada, presencia de mantequilla fundida, aceite de oliva y caldo de carne. Entre quince y treinta segundos  (Medium-high, presence of melted butter, olive oil and meat broth. Between fifteen and thirty seconds)

In addition to this extensive, informative and attractive PDF there is another part of this site where you can filter the list of cheeses, i.e. Buscador de quesos (Cheese Finder (aka Search Engine)). The filters are: Seleccione (Select): Comunidad Autónoma (Autonomous Community), tipo de leche (milk type), calidad diferenciada, régimen de calidad (differentiated quality, quality regime).  So for example I did search for cow’s milk (leche de vaca) cheeses from Cantabria and all (todas) quality regimes and got:

Marca (mark or brand) | Tipo (type) | Procedencia Leche (origin of milk) | Comunidad Autónoma (Autonomous Community)
Picón-Bejes-Tresviso | D.O.P. | Leche de vaca | CANTABRIA
Queso Nata de Cantabria | D.O.P. | Leche de vaca | CANTABRIA
Queso Pasiego | Sin figura de calidad comunitaria reconocida (no recognized community quality figure) | Leche de vaca | CANTABRIA

After finding the list you can click on the cheese name for the full information page equivalent to the CATÁLOGO pages. You could either use the search tool to find a cheese you might want to try (some Spanish cheeses can be obtained online) or browse the CATÁLOGO.


Back to my process for compiling a list of cheeses

Undaunted by these challenges, which I knew from past experience, I decided it was time to assemble a complete and accurate list. This matters only slightly for reading menus at restaurants and would more likely be useful for purchases at retail establishments, but again, knowing what you’re eating in another country is the inspiration for my project.

So I proceeded with the usual suspects, first doing several Google searches (to get the terms right to provide the best source materials) and then following several promising sources. As usual Wikipedia had a useful page List of Spanish cheeses with a fairly long list (fortunately tagged by region) with some links to pages for the more common cheeses. Having processed this list I immediately assumed the Spanish language version of Wikipedia would possibly have an even better list and it did – Quesos de España. Another seemingly authoritative source, Spanish Cheese Guide, covers all (?) of the DOP names.

From all these sources I generated a single list, which required picking a “canonical” name and then finding all the variations from the sources. For example this cheese, Arzúa-Ulloa, appeared in all my sources (compiled thus far) but, as you can see, under quite different names, even including a misspelling.

Queso Arzúa-Ulloa (P.D.O.) Galicia 1 link
Arzula Illoa 2 link
Arzúa Galicia 3
Arzúa-Ulloa Galicia 5 link
Arzúa-Ulloa Galicia 6 link
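I did this consolidation largely by hand, but part of it could be sketched in code. The following is only a rough illustration, not the process I actually used: it strips accents, the word "queso" and parenthetical tags like (P.D.O.), then uses Python's standard difflib to score each variant against a list of canonical names. The function names and the 0.8 threshold are my own assumptions for the example.

```python
import difflib
import re
import unicodedata


def normalize(name):
    """Lowercase, strip accents, drop '(...)' tags and the word 'queso'."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower()
    text = re.sub(r"\(.*?\)", " ", text)      # e.g. "(P.D.O.)"
    text = re.sub(r"\bqueso\b", " ", text)    # generic word, not part of the name
    text = re.sub(r"[^a-z0-9]+", " ", text)   # hyphens and punctuation become spaces
    return text.strip()


def best_match(variant, canonical_names):
    """Return (canonical name, similarity score) for the closest canonical name."""
    target = normalize(variant)
    scored = [
        (difflib.SequenceMatcher(None, target, normalize(c)).ratio(), c)
        for c in canonical_names
    ]
    score, name = max(scored)
    return name, score


def consolidate(variants, canonical_names, threshold=0.8):
    """Group variants under canonical names; weak matches go to a review list."""
    accepted, review = {}, []
    for variant in variants:
        name, score = best_match(variant, canonical_names)
        if score >= threshold:
            accepted.setdefault(name, []).append(variant)
        else:
            review.append((variant, name, score))
    return accepted, review


if __name__ == "__main__":
    canonical = ["Arzúa-Ulloa", "Cabrales", "Murcia", "Murcia al vino"]
    found = ["Queso Arzúa-Ulloa (P.D.O.)", "Arzula Illoa", "Arzúa", "Arzúa-Ulloa"]
    accepted, review = consolidate(found, canonical)
    print(accepted)
    print(review)
```

With these inputs the misspelled "Arzula Illoa" scores high enough to be grouped automatically, while the short form "Arzúa" falls below the threshold and lands in the review list. That seems like the right outcome: a rule that accepted any shortened name would also merge genuinely different cheeses such as Murcia and Murcia al vino, so short forms still need a human look.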

So after consolidating the list from five sources and choosing what appears to be the “standard” name (for those cheeses that appear on more than one list) here is what I believe is a fairly comprehensive list:

Abredo, Acehúche, Afuega’l Pitu, Ahumado de Pría, Alhama de Granada, Alpujarras, Andalucía de cabra, Ansó-Hecho, Aracena, Arribes de Salamanca, Arzúa-Ulloa, Babia y Laciana, Barros, Benasque, Beyos, Buelles, Burgos, Cabrales, Cáceres, Cádiz, Camerano, Campo Real, Campoo-Los Valles, Casín, Cassoleta, Castellano, Cebreiro, Colmenar Viejo, Flor de Guía, Fresnedillas de la Oliva, Gamonedo, Garrotxa, Gata-Hurdes, Gaztazarra, Genestoso, Gran Canaria, Grazalema, Guriezo, Herreño, Ibores, Idiazábal, L’alt Urgell y La Cerdanya, La Adrada, La Bureba, La Calahorra, La Gomera, La Montaña de León, La Nucía, La Peral, La Serena, La Siberia, La Sierra de Espadán, La Vera, Lanzarote, Letur, Los Montes de Toledo, Mahón-Menorca, Majorero, Málaga, Mallorquí, Manchego, Mató, Miraflores, Montsec, Murcia, Murcia al vino, Nata de Cantabria, Oropesa, Oscos, Ossera, Palmero, Pasiego, Pastor, Pata de mulo, Pedroches, Peñamellera, Picón Bejes-Tresviso, Pido, Quesaílla, Quesucos de Liébana, Requeixo, Roncal, San Simón da Costa, Serrat, Servilleta, Sierra Morena, Tenerife, Teruel, Tetilla, Tiétar, Torremocha del Jarama, Torta del Casar, Trapo, Tronchón, Tupí, Urbiés, Valdeón, Valle de Alcudia, Valle del Narcea, Vidiago, Villalón, Zamorano

There are around 30 more where I’ve found at least one mention but I’ll have to search for each of these individually (once I have the complete list) to see if these cheeses really exist (at least currently) or are just a spurious mention in some online list.

Another country menu; Tour de France

I’ve picked up my treadmill pace (and thus my miles on my “virtual” Camino trek) and so I’ve reached Frómista in the Palencia province of the Castile and León autonomous community. There I found four different eating establishments with online menus, so I have a lot of raw source material to translate, analyze and feed into my corpus.

It’s been easier to get more miles on my stationary exercise equipment because now I’ve got the Tour de France on TV to inspire me (more than the usual daytime TV shows). While I’ve mentioned I’ve now done 222.5 miles on the treadmill, I’ve also done 3665.6 miles on my stationary bike in the same time period. When I lived in California, counting biking to work, I usually did about 5000 miles a year, so my boring stationary riding is about comparable in distance to what I used to do 25 years ago. But even boring bike commuting was a lot more fun on real roads (especially in the San Francisco Bay Area, which has some excellent biking routes), so at least with the Tour on TV I can make that my vicarious experience. In the sprint to the finish in Stage 6 I managed to do 1.3 miles in the same time the racers did 1.5 miles: not bad, except they were climbing a very steep hill! I once got to participate in warmup laps with professional riders, so I have a pretty good idea how much better they are than I am. I was going full out and just barely keeping up with the pros (well below Tour level, just local California pros), who were just loafing along. So I have no illusions of ever being capable of racing, and certainly not at 72. But still it’s satisfying to “ride along” with the peloton.

But back to Spanish food and deciphering menus. Of the four possible in Frómista I’m reporting on the first, Villa De Fromista. At first I thought Google Translate badly botched a few items, but on further investigation I believe GT’s problem was due to the unusual HTML structure, which made it difficult to tell the boundaries between items, so Spanish words were “run together” in the text that Google translated. Since GT claims to use “context” (sometimes described as translating all the words as a group rather than word by word), parsing the menu items incorrectly is bound to confuse it. But this is yet another caution to readers who might think that in today’s high tech world a smartphone with machine translation is sufficient to decipher menus in a foreign language. Machine translation still has a ways to go, and so my project to build a superior translation tool, keyed to the actual structure of menus in restaurants in Spain, could still (if I succeed) be more useful.
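To show what I mean by item boundaries, here is a small sketch using Python's standard html.parser. I don't know exactly how this restaurant's page is marked up, so the HTML in the example is invented and the set of tags I treat as boundaries is just a guess at common cases. The idea is simply to cut the text wherever a block-level tag starts or ends, so that each menu item comes out as its own string before anything is sent for translation.

```python
from html.parser import HTMLParser

# Tags assumed to separate one menu item from the next (an assumption,
# not taken from the restaurant's actual page).
BLOCK_TAGS = {"li", "p", "div", "br", "tr", "td", "h1", "h2", "h3"}


class MenuItemExtractor(HTMLParser):
    """Collects text, starting a new item at every block-level tag."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._buffer = []

    def flush_item(self):
        # Collapse whitespace and store the buffered text as one item.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.items.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.flush_item()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.flush_item()

    def handle_data(self, data):
        self._buffer.append(data)


def extract_items(html):
    parser = MenuItemExtractor()
    parser.feed(html)
    parser.close()
    parser.flush_item()  # keep any text left after the last tag
    return parser.items


if __name__ == "__main__":
    # Invented markup: two items split only by <br>, one split across <span>s.
    sample = (
        "<div>LECHAZO ASADO<br>COCHINILLO ASADO</div>"
        "<span>REVUELTO</span> <span>DE SETAS</span>"
    )
    for item in extract_items(sample):
        print(item)
```

Inline tags like span are deliberately not treated as boundaries, so an item split across several of them is glued back together. A real page would probably need its own list of boundary tags worked out by looking at the source.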

So, a few items of interest, and I’ll get to the other three restaurants in another post. The restaurant has a MENÚ DEL PEREGRINO (Pilgrim’s Menu) for a mere 11’50 € and the MENÚ ESPECIAL for 19’50 €. It also offers GUARDA BICICLETAS, which Google translates as ‘KEEPING BIKES’ and Microsoft translates as the more obvious ‘Bike Guard’ (presumably the same as what is called a bike rack in the USA), and this fits into my focus on the Tour. As I’ve studied the Camino in detail I have wondered about biking it instead of walking. I did do a long (escorted) ride in Germany and Austria once and I found biking to be a very pleasing pace for touring: not so fast that you miss everything, as with a car, but not so slow, as with walking, that the scenery changes little during the day. Since I’m averaging 26.2 miles/day on my stationary bike, maybe working back up to 50 miles/day (which was my Germany pace), and thus completing the Camino in less than two weeks, should be my focus (plus the possibility of going miles off the Camino to find better food or accommodations, plus fewer crowds).

Anyway back to the menu. The biggest mistake in translation which I don’t think is due to parsing the HTML is:

BACALAO REBOZADO CON PATATAS FRITAS
Google: COCO REBOZADO WITH FRIED POTATOES
Microsoft: Battered cod with french fries

Microsoft’s translation is much better (certainly more useful). How bacalao became ‘coco’ is a real mystery. We’ve encountered rebozado before; it is just the past participle of the verb rebozar (to coat with batter). So this really is a fairly simple item to translate.

And this is kinda funny but obviously a poor translation:

REVUELTO DE SETAS
Google: REVOLTED MUSHROOMS
Microsoft: Mushroom Scramble

because we’ve covered revuelto already in this blog and ‘revolted’ isn’t even close.

LECHAZO ASADO (‘roasted lamb’; Microsoft got the animal right but missed that this is one of the standard references to suckling (unweaned) lamb) and COCHINILLO ASADO (roast suckling pig) were totally botched by Google, but it’s so bad it has to be due to parsing issues in the HTML. Google displayed lettuce (actually lechuga) and chicken (actually pollo or gallina), neither of which is even close. Several times A LA PLANCHA becomes ‘to the plate’, which is a nominally correct literal translation, but as we’ve covered in other posts this really means ‘grilled’ (as on an iron griddle or skillet). ‘To the plate’ would be confusing if you didn’t know the more useful translation.
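One simple way to catch cases like these is a glossary of menu phrases that is checked before (or alongside) the machine translation. The sketch below is only a toy version of that idea, not the tool I'm building: the glossary holds just the handful of phrases discussed in this post, and the names MENU_GLOSSARY and gloss are made up for the example.

```python
import re

# Tiny hand-made glossary using only phrases discussed in this post.
MENU_GLOSSARY = {
    "a la plancha": "grilled",
    "lechazo asado": "roast suckling lamb",
    "cochinillo asado": "roast suckling pig",
    "bacalao rebozado": "battered cod",
    "patatas fritas": "french fries",
    "revuelto de setas": "mushroom scramble",
}


def gloss(item, glossary=MENU_GLOSSARY):
    """Return (phrase, meaning) pairs found in a menu item, in order of appearance."""
    text = " ".join(item.lower().split())
    hits = []
    for phrase, meaning in glossary.items():
        # Word boundaries keep a phrase from matching inside a longer word.
        match = re.search(r"\b" + re.escape(phrase) + r"\b", text)
        if match:
            hits.append((match.start(), phrase, meaning))
    return [(phrase, meaning) for _, phrase, meaning in sorted(hits)]


if __name__ == "__main__":
    for menu_item in ["BACALAO REBOZADO CON PATATAS FRITAS", "LECHAZO ASADO"]:
        print(menu_item, "->", gloss(menu_item))
```

This only finds known phrases and leaves everything else (like CON) to the machine translation, but even that would have prevented ‘coco’, ‘lettuce’ and ‘to the plate’.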

And this is an amusing translation that is actually more correct than it first seems:

ENTRECOT DE GANADO (lit: cattle or livestock) MAYOR (lit: older) (MADURADO MAS DE 25 DIAS)
ENTRECOT OF LARGEST LIVESTOCK (MATURED MORE THAN 25 DAYS)

In other words this is just an aged beef entrecote, where entrecôte (the French spelling) would mostly translate to ribeye. To a steak lover, what isn’t on the menu is whether this is dry-aged or wet-aged. Unless the steak is tiny, having this priced at 19’50 € (for all three courses) is either a very good deal or unlikely to be equivalent to this item in a premium steakhouse in the USA.

So, as usual, a more careful translation of the menu reveals a bit different view on what one might choose. Soon I’ll cover the other three restaurants in Frómista (that have online menus) as I trudge further west on my virtual Camino trek.

Left Burgos …

… the province, not the city which I left a long time ago.

Like most Americans I have a limited sense of the geopolitical subdivisions of Spain. Several years ago I learned about the autonomous community divisions and probably know most of them. But these are in turn (sometimes) divided into provinces, which don’t really correspond (most of the time) to states in the USA or provinces in Canada.

Thus I didn’t really expect to be crossing into a new province, Palencia, in the Castilla y León autonomous community (the largest in Spain). I discovered this by converting my basement treadmill “hiking” miles into a position along a GPS track of the Camino de Santiago. The best I can do for now is then to look at satellite or Street View images on Google Maps to get a clue of what it might be like to be at that spot along the Camino.

So I noticed the Puente Fitero, which looks like a relatively new (and attractive) bridge over the río Pisuerga. That’s approximately the boundary of Burgos and Palencia provinces, and my accumulated treadmill “hiking” of 213.8 miles puts me just past Itero de la Vega. After Palencia it looks like León province comes next before finally crossing into Galicia.
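For anyone curious how treadmill miles turn into a spot on the map, here is a minimal sketch of the idea, assuming the track is just a list of (latitude, longitude) points. It adds up the great-circle (haversine) length of each leg until the accumulated miles are used up, then interpolates within that leg. The three track points below are made up for illustration and are not taken from a real Camino GPS file.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(p, q):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def position_at(track, miles):
    """Return the (lat, lon) reached after walking `miles` along the track."""
    if miles <= 0:
        return track[0]
    walked = 0.0
    for start, end in zip(track, track[1:]):
        leg = haversine_miles(start, end)
        if leg > 0 and walked + leg >= miles:
            # Linear interpolation within the leg; fine for short legs.
            frac = (miles - walked) / leg
            return (start[0] + frac * (end[0] - start[0]),
                    start[1] + frac * (end[1] - start[1]))
        walked += leg
    return track[-1]  # walked past the end of the track


if __name__ == "__main__":
    # Made-up points, only to show the call.
    track = [(42.30, -4.10), (42.28, -4.25), (42.27, -4.40)]
    print(position_at(track, 5.0))
```

A real track has thousands of closely spaced points, so the straight-line interpolation within a leg introduces almost no error. The resulting coordinates are what I would then paste into Google Maps.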

Since my previous look at the route of the Camino was from the movie The Way, I was unaware of how much of the Camino passes through Castilla y León, which, frankly, looks pretty boring. The movie had far more scenes from Navarra or Galicia, both of which are a lot more interesting (and green and/or hilly). In fact a lot of the views I get in Castilla y León look closer to the Central Valley of California or in some cases even the Cowboy Trail here in Nebraska. I’d certainly not be very interested in hiking those, especially in summer, so this part of my “virtual” trek has dampened my enthusiasm for doing the Camino. Maybe only the short western segment (the minimum to qualify) would be better.

But I’ll keep doing my basement miles and converting them to my virtual trek as it remains a good incentive for the boredom of exercise.