Wine Terms

In my last post I mentioned I was using several websites (and pages within those sites) that had English translations, so I could extract the human English translation side-by-side with the (presumably) original Spanish. OK, done – so what? As I’ll be doing with all sources, I then begin an extraction process to add pairs (words or phrases) of translations to my corpus. A key part of that is assigning each pair some measure of “certainty” that the translation is correct; a probability-type measure (0.0…1.0) obviously fits. Then the corpus analysis program can find as many instances of the same pair as it can and evaluate a new certainty, i.e. something like: lots of instances of the same pair, each with possibly low certainty, may be as good as a few instances with high certainty. An interesting question, then, is whether human translation (relatively rare) of websites (mostly menus) is a more reliable source of information than machine translation.
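To make the "many low-certainty sightings may equal a few high-certainty ones" idea concrete, here is a minimal sketch. It assumes a noisy-OR rule (a pair is wrong only if every sighting of it is wrong) and that sightings are independent; both are my assumptions for illustration, not a settled design, and the function name and sample numbers are hypothetical.

```python
from collections import defaultdict
from math import prod


def combine_certainty(observations):
    """Merge repeated sightings of the same (Spanish, English) pair.

    observations: iterable of (spanish, english, certainty), certainty in 0.0..1.0.
    Returns {(spanish, english): combined certainty}.
    """
    by_pair = defaultdict(list)
    for spanish, english, certainty in observations:
        by_pair[(spanish.lower(), english.lower())].append(certainty)
    # Noisy-OR (an assumption): the pair is wrong only if every sighting is wrong.
    return {
        pair: 1.0 - prod(1.0 - c for c in certainties)
        for pair, certainties in by_pair.items()
    }


# Hypothetical sightings: three weak ones for barrica, one strong one for bodega.
sightings = [
    ("barrica", "cask", 0.4),
    ("barrica", "cask", 0.4),
    ("barrica", "cask", 0.4),
    ("bodega", "winery", 0.8),
]
combined = combine_certainty(sightings)
```

With these made-up numbers the three 0.4 sightings combine to about 0.78, nearly the same as the single 0.8 sighting. If the sightings are really copies of one another (one website quoting another) the independence assumption fails and this rule would overstate the certainty.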

Of course the extraction process itself (which I do and therefore is subject to error) plays a role as well so I’ll use my small corpus of wine webpages to extract a set of pairs and then use any other sources of wine terminology to confirm/deny my pairs (just manually, so I understand the data, before trying to write code to do this). So here’s my result:    (scroll down past list for more of this post)

Abierto 2 open
Acerb 2 acerbic
acidez 1 acidity
ácido 1 acid
Aceite Esencial 2 essential oils
afinamiento 1 refinement
afrutado[s] 1 fruity
agradables 1 nice, pleasant, agreeable
Alegre 2 zingy
Amoratado 2 inky
Amplio 2 big
Añada 2 vintage year
arcillo 1 clay
Armónico 2 harmonious
aromas 1 aromas
aromática 1 aromatic
Barrica Bordelesa 2 Bordeaux cask
barrica 1 cask or barrel
beber 1 to drink
Blanco Seco 3 Dry White
Blanco 2 white
boca 1 literally mouth, but can mean palate in wine tasting context
Bodega 3 winery
Bodeguero 3 winemaker
Bota 2 butt
botella 1 bottle
Botritis 2 botrytis
brillante 1 bright
brotaciones, brotación 1 [not found] budding ? (derivative of brotar)
brotar 1 to sprout, bud
calidad 1 quality
campaña 1 growing period, season
campo 1 field
canela 1 cinnamon
cánones del clasicismo riojano 1 classic Rioja style (not literal)
Capa 2 layer
cata 1 tasting (action of)
cereza 1 cherry
Cerrado 2 closed
Clarificación 2 fining
clásico de Rioja 1 Rioja classic
comarca 1 region, district
complejidad 1 complexity
Complejo 2 complex
Corcho Cork
cosecha 1 harvest, crop; vintage
Crianza en barrica 4 Aging in barrel
crianza en madera 1 aged in wood (literally, cask colloquially)
crianza 1 aging
cuerpo 1 body
Dejo 2 aftertaste
Denso 2 dense
Depositos 4 Deposits
Dorado 2 golden
Dulce 2 sweet
Elaborado por 3 Produced, matured by.
elegante 1 elegant
Embotellado por 3 Bottled by
Embotellar 4 To bottle
en barrica 1 in cask or barrel
envejecimiento 1 aging (also laying down)
equilibrado 1 balanced
equilibrio 1 balance
Especiado 2 spicy
Espeso 2 thick
Estructura 2 structure
Evolucionado 2 evolved
expresivo 1 expressive
Fermentación alcohólico 4 Alcoholic fermentation
Fermentación maloláctica 4 Malolactic fermentation
fermentación 1 fermentation
final de boca 1 “finish” (literally end/finish of mouth)
final 1 after-taste
fino 1 fine
florales 1 floral
fresco 1 fresh
frescura 1 freshness
frutos cítricos 1 citrus fruits
Fuerte 2 strong
Graciano 1 red grape variety
grados 1 grade or degree (but alcohol by volume)
Heces 2 sediment
Hoja 4 Leaf
Hollejo 2 grape skin
Joven 2 young (little or no aging)
Jurado de Cata 2 wine tasting panel
Lágrimas 2 tears
Levaduras 4 Yeast
Lías 2 lees
limpio 1 clean
Maceración Carbónica 2 carbonic maceration
Maceración en frío 2 cold maceration
maceración 1 maceration
madera 1 wood
madura 1 ripe, mature
madurar 1 to mature
Manchado 2 literally ‘stained’
manzana 1 apple
maridaje 1 literally marriage or combination; food matches/pairings
Mazuelo 1 red grape variety
mezcla 1 mixture, blend
mosto 1 must (grape juice)
nariz 1 nose (also aroma)
notas 1 notes
olores 1 smell (scents in corpus)
Oro 2 gold
Oxidación 2 oxidation
parámetros de calidad 1 quality indicators
Pasa 2 raisin
Pepita 4 Seed
Perfumado 2 perfumed
Persistencia 2 persistence
Pimienta 2 black pepper
postgusto (posgusto) 1 [not found] after-taste
Prensa 4 Press
prensado 1 pressing
pulidos 1 polished
Rama 2 branch
Recio 2 gutsy
Redondo 2 rounded
Refrescar 2 refresh
Regaliz 2 liquorice
Roble Americano 4 American oak
Roble Francés 4 French oak
roble 1 oak (as in the barrels)
Rojo 2 red
Rosado 2 rosé
sabor 1 flavor, taste
Sabroso 2 flavorsome
Seco 2 dry
sedoso 1 silky
Semidulce 2 semi-sweet
Semiseco 2 semi-dry
sensación 1 sensation
Suave 2 smooth
suelos 1 soils (also ground, floor, land)
Tabaco 2 tobacco
tanino 1 tannin
temperatura controlada 1 controlled temperature
temperatura de servicio 1 serving temperature, aka, best served at
Tempranillo 1 grape variety
terciopelo 1 velvet
Típico 2 typical
trasiegas 1 decant (rackings in corpus)
untuoso 1 literally greasy (aka unctuous), but nicer means ‘smooth’
uva 1 grape
Vainilla 2 vanilla
valores 1 values (as in levels of an indicator)
variedad 1 variety or varietal
vendimia 1 vintage, grape harvest (whole process)
Vid 4 Vine
Viña 3 Vineyard.
viñedos 1 vineyard, vines
Vino blanco 4 White wine
Vino de calidad (Quality wine) 3 Must come from a DO or DE. Only wine made from the free-run or lightly pressed juice of ripe healthy grapes, which has undergone a temperature controlled fermentation, qualifies.
Vino de cosecha, or vendimia 3 Wines of a particular vintage year. In special cases, if the purpose is to improve the quality of the wine, a maximum of 15% of wine of a previous year may be added.
Vino espumoso 4 Sparkling wine
Vino Fino de Mesa 3 Fine table wine.
Vino Generoso 3 Special aged dry or sweet wines of higher alcoholic strength than table wines. From the Latin term for excellence. Sherries are vinos generosos.
Vino rosado 4 Rosé wine
Vino tinto 4 Red wine
vino 1 wine
Viura 1 white grape variety
viveza 1 vividness, strength
Vivo 2 lively
Yema 2 yolk
Zarzamora 2 blackberry

I combined four lists. In MSWord I can use different colors and fonts for each list so when I merge them I can easily see where any pair came from, but here in WordPress formatting is more limited so the middle column indicates the source. My extracted list (from all those webpages I processed from both bodegas and restaurants) is 1. I chose not to provide links for the other three sources, but 2 was certainly the largest.
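Since the middle column carries the source, each row of the list can be split back into term, source and translation. This is only a sketch under my own assumptions: the `Entry` and `parse_row` names are hypothetical, and the pattern treats the first standalone digit 1 to 4 as the source column, which would misfire on a term that itself contains such a digit.

```python
import re
from typing import NamedTuple, Optional

# Term (possibly several words), then a lone source digit 1-4, then the translation.
ROW = re.compile(r"^(?P<term>.+?)\s+(?P<source>[1-4])\s+(?P<translation>.+)$")


class Entry(NamedTuple):
    term: str
    source: int
    translation: str


def parse_row(line: str) -> Optional[Entry]:
    """Return an Entry, or None when the row has no source column."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    return Entry(match["term"], int(match["source"]), match["translation"])


example = parse_row("Barrica Bordelesa 2 Bordeaux cask")
no_source = parse_row("Corcho Cork")
```

Rows that come back as `None` (such as the one with no source number) are worth checking by hand rather than guessing a source for them.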

I eliminated duplication and then used a simple notion of “certainty”. Items from list 1 that are shown here in bold had one or more identical (or almost identical) translation in one of the other lists. This isn’t a particularly robust definition of certainty but it will do for this proof of concept.
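I did this marking by hand, but the rule is simple enough to sketch in code. The version below is an approximation of what I did, not a record of it: "almost identical" is reduced to "shares at least one gloss after splitting on commas, semicolons and the word or", and the helper names and sample rows are hypothetical.

```python
import re


def glosses(translation):
    """Split 'cask or barrel' or 'harvest, crop; vintage' into separate glosses."""
    parts = re.split(r",|;|\bor\b", translation.lower())
    return {part.strip() for part in parts if part.strip()}


def confirmed_terms(entries, mine=1):
    """Terms from list `mine` that share a gloss with the same term in another list.

    entries: iterable of (term, source, translation) taken before de-duplication.
    """
    seen_elsewhere = {}
    for term, source, translation in entries:
        if source != mine:
            seen_elsewhere.setdefault(term.lower(), set()).update(glosses(translation))
    return sorted(
        term
        for term, source, translation in entries
        if source == mine
        and glosses(translation) & seen_elsewhere.get(term.lower(), set())
    )


# Hypothetical rows; the second one stands in for a duplicate found in list 2.
rows = [
    ("barrica", 1, "cask or barrel"),
    ("Barrica", 2, "cask"),
    ("afinamiento", 1, "refinement"),
    ("Bodega", 3, "winery"),
]
confirmed = confirmed_terms(rows)
```

Matching on lowercase spelling alone would miss variants such as postgusto versus posgusto, so a real version needs some looser comparison of the Spanish side as well.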

So of the 171 terms in the merged list (82 are from my manual extraction, the remainder from one of the other three lists) only 24 of my extracted terms get marked as “certain” due to occurring in other lists:

afrutado[s], barrica, botella, cata, cosecha, elegante, equilibrado, fermentación, final de boca, fresco, maceración, madura, mosto, postgusto (posgusto), roble, sabor, sedoso, tanino, untuoso, uva, variedad, vendimia, viñedos, vino

There could have been some more since I did not extract really obvious terms from my corpus, such as blanco or seco or dulce or uva. And two of the “confirmed” terms actually are in dispute. One source admits afrutado is used for ‘fruity’ but says this is actually wrong and the term should be frutal. The dictionary confirms afrutado does mean ‘fruity’ but this does not confirm it is the correct term to use in a wine context. Likewise it confirms frutal to be fruit or fruit tree but doesn’t mention how this would be a taste term for wine. So who knows? Which is right? Wine terminology (in English) sometimes contradicts the more common meanings of words since wine tasters understand a particular word in a particular context (and we amateurs just have to learn what they mean). So it’s certainly possible this source might be right BUT how would this ever be confirmed?

Likewise postgusto (clearly ‘after taste’ from context) doesn’t appear in any dictionary. And it does appear in the other lists, but spelled posgusto. Now I’m not sure if this meets the definition of neologism, especially as ‘post’ can mean ‘after’ (in this context) in English but doesn’t occur in Spanish, whereas gusto is ‘taste’ or ‘flavor’. So does this word actually exist (or get used in wine documents), and which is the appropriate form?

There was also some conflict between viñedos and viña. Both are in the dictionary as vineyard but only viña is listed as vines. That is then potentially a flaw in my extraction of pairs since I saw viñedos clearly translated as ‘vines’ in a human translation but, of course, that person may have confused these two terms.

The term I’m happy I was able to figure out (lots of examination of text to reach my conclusion) is final de boca. This literally would translate to ‘end of mouth’, but it’s more accurate to translate it as ‘finish’, which is actually one of those terms whose usage in wine descriptions has quite a different meaning than its common meaning. And one of the lists pronounced that just final is sufficient for ‘finish’, which is one of the literal translations itself. OTOH boca itself has some ambiguity. It literally means ‘mouth’ but was commonly translated as ‘palate’ in the human translations. That’s not any of the literal translations of ‘palate’. But, again, palate is a word that has a different meaning in wine tasting context than its more common meanings.

So, this is all human analysis, with a lot of trial-and-error, back-and-forth, looking in dictionaries and doing web searches. In this contest of John Henry and the machine I think man will win so I really wonder how effective any AI (or just statistical analysis) can be. OTOH, ‘man’ needs to be a fluent Spanish speaker who participates in Jurado de Cata (wine judging panel) and I fall way short of that. But, still, what is the chance I can still produce the best list of wine terms freely available on the Internet? Pretty good, I’d say (given few are even trying).



Something different – wine label and description

By coincidence I decided to get some good wine for our Valentine’s Day dinner. We cook ourselves because: a) we actually can cook some things better than restaurants do, and, b) we spend our money on the ingredients, not the restaurant’s labor and real estate. So off to Whole Foods for some very good wines at the same price as medium wine with restaurant markup. There isn’t a lot of Spanish wine available here. Trader Joe’s has some amazing values, cheap but tasty Spanish wines, but for a Reserva Whole Foods was my only option. Since I bought this wine I’ll allow myself to link their image.




from Bodegas de los Herederos del Marqués de Riscal (website)



Visit the website link I provided as this is an interesting place, very oriented to visitors and with a striking Frank Gehry designed hotel and elegant restaurant.

But finding this wine, first in my little-used Peñín Guide to Spanish Wine 2016, which led to the website, gave me an opportunity to look at some translation issues related to wine. For the wine I bought there is a PDF for Spanish and another for English which appears to me to definitely be a human translation, thus providing the rare opportunity to compare side-by-side Spanish, human English, and computer English. For example:

Antes de salir (lit: go out, leave) al mercado tiene (lit: has) un periodo mínimo de afinamiento (lit: refinement) en botella de un año. Before release for sale it spends a minimum of one year rounding off in the bottle; time enough to show how much complexity tempranillo is able to achieve.

{Before going on the market, it has a minimum bottle-tuning period of one year.}

[Before going on the market, it has a minimum refining period in bottle of one year.]

I did a few dictionary lookups and noted the translation in the Spanish as (lit: whatever). The first English translation is the human one directly from the PDF. This has a definite clue that it’s a human translation, since the English includes an additional part (underlined in the original: the clause after the semicolon) that has no match of any kind in the Spanish, so the author chose to add this bit. The {whatever} part is the translation done by spanishdict (actually Microsoft) and the [whatever] part is the translation done by Google (I had to paste the Spanish into my own test page at this blog since PDFs don’t get processed by Google in Chrome).

For me there are a couple of interesting issues in these translations:

  1. ‘Before going on the market’ seems to be a more “accurate” translation of Antes de salir al mercado BUT the human translation “Before release for sale” might actually be more accurate, i.e. this wine might not have literally gone to a mercado in order to be sold.
  2. periodo mínimo de afinamiento en botella is interesting for its three different English renderings: [human] “minimum of one year rounding off in the bottle”, [spanishdict] “minimum bottle-tuning period”, and [Google] “minimum refining period in bottle”. When I look up afinamiento I get refinement, which Google uses (also the closest to a word-by-word literal translation); I think this is definitely better than ‘tuning’ (no idea where that came from) and perhaps better than the human ‘rounding off’. Also, ‘period’ is omitted in the human translation but literally present in the Spanish and both machine translations.

So let’s look at some more for this, some simple differences in the human translation versus literal lookup or machine translation:

VARIEDAD DE LA UVA (lit: variety of grape) VARIETY USED
GRADOS (lit: degree or grade) 14,1º ALC./VOL 14,1º

Grados is probably not a translation issue, just a different description used in Spain versus the more typical one used in U.S. (although note the English is British, not U.S. English so who knows what this might mean, as in possibly a legal labeling requirement somewhere).

MARIDAJE (lit: marriage, combination,  union) FOOD MATCHES

And this is another interesting turn of phrase. In U.S. “food matches” might also be “food pairings” and, in a stretch, “married” might be used in this context. With only this single sample I can’t draw any conclusion but I find it amusing language to use maridaje for this meaning.


Again, the human translation is definitely not very literal but carries the meaning just fine and frankly I’d prefer the English term (which literally translates to the bulky mejor servido en).

ATRIBUTOS (lit: attributes) GUSTATIVOS (lit: taste) APPEARANCE (lit: aspecto o apariencias)

This one, however, is a little misleading (I think) to switch from ‘taste attributes’ to ‘appearance’. The text (see some below) under this heading covers: color, nose, tannin and finish, a mixture of sight, smell and taste sensations so ‘appearance’ is a bit too narrow to cover all these.

En boca (lit: mouth) es fresco, con taninos pulidos (lit: polish) muy agradables (lit: nice, pleasant, agreeable), con buena estructura pero fácil de beber. Fresh and easy to drink on the palate, good backbone and lovely, polished tannins.

{In the mouth it is fresh, with very nice polished tannins, with good structure but easy to drink.}  

[The palate is fresh, with very nice polished tannins, with good structure but easy to drink.]

The human translation, though useful and pleasing, has little resemblance to the original Spanish (backbone is completely missing in the Spanish). The spanishdict translation is quite literal but definitely gets the meaning across (in wine tasting tannin is almost something you feel on your tongue (pucker) rather than a taste). That Google decided to use palate for boca is surprising; perhaps it bears out their claim that their AI figures out translation via context. While dictionary lookups certainly do not have palate for boca or boca for palate, it is appropriate, and surprising, that the machine translation went down the same path as the human translation.

While there are many more interesting things I’m finding from this description webpage I should wind down, so I’ll just leave you with these bits of the description of the weather at the vineyards for this vintage year (spanishdict translations {xxx} added to human translation).

La vendimia de este año ha estado condicionada, en gran medida, por varios puntos clave sucedidos a lo largo de toda la campaña.

Comenzamos el ciclo con un estado de reservas importante, que se tradujo en brotaciones buenas y viñedos con una carga en general elevada.

La ausencia de heladas primaverales, vientos fuertes en brotación y granizadas de verano, hacen que lleguemos a mediados de septiembre con unas uvas muy sanas y con unos parámetros de calidad que sugerían estar ante una cosecha interesante.

This year’s vintage has been, to a great extent, conditioned by a series of key events during the growing period.  {This year’s harvest has been largely conditioned by several key points that have occurred throughout the campaign.}

We started the cycle with good reserves and this was reflected in good budding and vines which would be heavily laden in general.  {We started the cycle with a major reserve state, which resulted in good sprouts and vineyards with a high overall load.}

The absence of spring frosts, strong winds during budding and hailstorms in the summer meant that we reached the middle of September with very healthy grapes and quality indicators which promised a very interesting harvest was on the way. {The absence of spring frosts, strong winds in sprouting and hailstorms of summer, make that we arrive in mid-September with very healthy grapes and quality parameters that suggested to be before an interesting harvest.}

I will crunch this some more (plus extract even more from this website) to obtain a list of useful terms in describing wine.

That, and drool a bit, at the prospect of actually visiting this place and staying at their hotel and chowing down on their menu but short of winning the lottery that probably isn’t going to happen.

P.S. I found a restaurant (website) that carries the wine (above) and so found a price, 23€, which is about $29 and about what I paid at Whole Foods. But that is a restaurant price (with service) so I’d guess a bottle of this wine in a retail outlet (or at the winery itself, quite a touristy place) would go for around $20, somewhat less than retail imported into the U.S.

Adventure with olive oil

I got bogged down with TMI from sources so I’m going off on a brief digression about aceite de oliva. I stumbled onto an interesting source about this. I was crunching through a glossary I’d found which would provide a comparison source to what I found in the GallinaBlanca diccionario. But with any online source one has to evaluate it to see how accurate it is and in the case of Spanish whether it applies very specifically to Spain. That’s what I was doing when I bumped into yet another glossary. But the site that contained that glossary had numerous “lists” that could be interesting for my project (accumulating a large corpus of food terms used in Spain). So one that caught my interest and is now the digression I mentioned is all about aceite de oliva. The post continues past the table showing what is available at this handy website (which I’ll be analyzing for weeks and future posts):

Términos Gastronómicos Gastronomic Terms
Utensilios de cocina Cookware
Diccionario del aceite de oliva Dictionary of olive oil
Diccionario de gastronomía vasca Basque gastronomy dictionary
Diccionario del café Dictionary of coffee
Glosario del tapeo “tapas españolas” Glossary of tapas “tapas tapas”
Glosario de cocina colombiana Glossary of Colombian cuisine
Glosario de cocina vegetariana argentina Glossary of Argentinean vegetarian cuisine
Glosario de los alimentos Glossary of food
Glosario de los vinos Glossary of wines
Glosario de las Frutas Glossary of Fruits
Catálogo de especies pesqueras Catalog of fishing species
Glosario de las plantas medicinales Glossary of medicinal plants
Glosario de cocina japonesa Glossary of Japanese cuisine
Diccionario culinario inglés-español English-Spanish culinary dictionary

The olive oil dictionary begins with this preface (translation by Google):

Aquí encontrarás algunos de los términos del aceite más usados, y los que más utilizan los catadores de aceite. Te resultarán útiles para entender mejor el mundo del aceite. Here you will find some of the most used oil terms, and those most used by oil tasters. You will find it useful to better understand the world of oil.

Sounds good to me. I’ve done my usual thing of getting side-by-side original Spanish and the Google English translation so I can evaluate each entry (for example):

Aceite vegetal: Es el que se saca de los vegetales tales como coco, maíz, maní, ajonjolí, soya, oliva, etc. Vegetable oil: It is the one that is extracted from vegetables such as coconut, corn, peanut, sesame, soy, olive, etc.

Here the translation is very easy to match up corresponding words and determine this is certainly a good enough translation. But we’ll look at a few that have challenges.

But first what about “olive” itself. It turns out there are multiple Spanish words for this with subtle difference: olivo, oliva and aceituna. Now having the masculine and feminine version of the basic noun oliv• has parallels in other cases (but not, as I researched, an absolute rule). The masculine form refers to the plant (or tree in this case) and the feminine form refers to the fruit of this plant/tree. So that explains that, but then should we use aceituna (which is a feminine noun) or oliva? I can’t find a definitive answer to this but I do get a general clue: oliva is going to be used as a qualifier as aceite de oliva (and maybe other cases) – IOW, a generic reference; whereas aceituna will refer more specifically to an actual olive (like you’d get in a tapa).

So with that settled we can move on to some challenges in translation. Now as I’ve said before I’m not picking on Google (or any of the other translators) since I think they’re quite remarkable and very helpful. But they have limits and it’s useful to attempt to characterize those. The criticisms I’ve seen tend to come more from a literature perspective and frankly I agree, I think machine translation badly botches that more complex language. But for my purposes (translating menus) a glorified literal translation is probably good enough to get the idea. But bear in mind, those of you thinking you can head to Spain with just a smartphone: this technology does have a lot of holes.

Now the first one I’ll pick on is simple:

Aceitunada: Cosecha de la aceituna. Olive: Harvest of the olive.

cosecha does literally translate to ‘harvest’ or ‘crop’ (either the process or the season) but aceitunada, which I initially thought was a simple diminutive of aceituna actually appears to have a more robust meaning, which is (according to spanishdict) “The season for gathering olives” (as a noun) or “Of an olive color” (as an adjective). I’m guessing Google just used the adjective translation and shortened it. But given this diccionario is giving us definitions of terms, obviously aceitunada does refer to harvest season, not an actual olive (or some diminutive of olives).

So moving on we have an interesting one here:

Dulce: Aceite de agradable sabor, que resulta dulce por su carencia de amargor, picante o  astrigencia. Sweet: Oil of pleasant flavor, which is sweet for its lack of bitterness, spicy or astrigence.

I noticed this simply because the spell correction objected to astrigence in the English translation. Sure enough, using Oxford as an authoritative source there is no such word in English, so, curiously, why did Google pick it? My best guess is that sometimes Google does a literal translation of individual words via rules (even though they say everything is learning based, not rule driven). But I’ve seen this before and especially in this translation (later in this webpage): caracacterístico as ‘caracacteristic’ – see the pattern, the gross misspelling relative to the correct translation of característica (‘distinguishing feature’ sense) or  característico (‘typical’ sense) to characteristic (also  atribbuto translated as attribbute, where did that come from, rule or bad learning set) – that looks like an algorithm to me, but, of course, it could be “learning” but from an incorrect source. Who knows. But in researching this I spot another issue: astrigencia appears to be wrong. I say “appears” because it is a judgment call I’m not qualified to make to 100% assert this is an error, but I believe it is. spanishdict (somewhat like Google search) considers the possibility you misspell your search term and so corrected my input (directly clipped from the webpage, so not my error) to astringencia which then has the correct English word as translation, ‘astringency’ and the reverse lookup matches. The authoritative dictionary doesn’t have astrigencia but does have astringencia so I’d call that settled but I don’t have the authority to make this determination.
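The "did you mean" correction described above can be imitated with Python's standard `difflib`. This is only a sketch of the general idea: I have no knowledge of how spanishdict or Google actually do it, the tiny vocabulary is a hypothetical stand-in for a real dictionary word list, and the `suggest` helper and its 0.8 cutoff are my own choices.

```python
from difflib import get_close_matches


def suggest(word, vocabulary, cutoff=0.8):
    """Return the closest known word, or None if nothing is similar enough."""
    matches = get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical stand-in for an authoritative Spanish word list.
vocabulary = ["astringencia", "amargor", "aspereza", "característico", "atributo"]

fixed_astrigencia = suggest("astrigencia", vocabulary)
fixed_caracteristico = suggest("caracacterístico", vocabulary)
fixed_atributo = suggest("atribbuto", vocabulary)
unknown = suggest("almazara", vocabulary)
```

Running the suspect Spanish through a check like this before translating would at least flag astrigencia, caracacterístico and atribbuto as probable misspellings in the source, rather than leaving the translator to invent English non-words for them.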

Now just a few simple ones, but demonstrations of how a critical term isn’t done properly by machine translation, which is my point that you can’t rely on current smartphone implementations for critical translations.

Basto: Viscosidad o aspereza que se aprecia en algunos aceites corrientes, dejando en la boca una sensación pastosa. Basto: Viscosity or roughness that can be seen in some ordinary oils, leaving a doughy feeling in the mouth.

Google couldn’t figure out basto but it’s readily available in dictionaries translated to ‘coarse’ or ‘rough’, which does match this definition. So if a restaurant told you the cheaper olive oil was demasiado basto Google is no help figuring that out.

This one was interesting to figure out and led to a couple of fun links I’ll provide:

Almazara : Edificio donde se encuentra el equipo necesario para la obtención del aceite de oliva Almazara: Building where the necessary equipment for obtaining olive oil is located

Again Google couldn’t translate the term of this entry, almazara, to what spanishdict defined as ‘oil press’ or ‘oil mill’. That’s a bit ambiguous (is it the piece of equipment or the building) but Oxford says it is “Mill in which oil is extracted from olives”. Now searching for almazara on the Net revealed these two interesting links: a) a blog post about the “building” (really a factory, possibly of multiple buildings, as described), and, b) the website for one of these that nicely describes how they produce their oil and their oils (I doubt you’d find this particular oil in even the best gourmet markets in USA but you might find some other Spanish brand so these descriptions could be helpful). You’ve already learned that bodega is winery so now you know what to look for to find olive oil being produced (and possibly sold).

The next one has some interesting subtle issues with translation:

Almendrado : Sabor que recuerda el gusto de los frutos secos y que suele presentarse en los aceites virgen extra del Bajo Aragón y Cataluña Almendrado: Taste that recalls the taste of nuts and that usually occurs in the extra virgin olive oil of Bajo Aragón and Catalonia

Google couldn’t decide what almendrado means, but the definition (in the English translation) triggered my memory of almendra (literally ‘almond’, which, btw, follows that masculine/feminine rule – the tree is almendro and the nut is almendra). As a noun almendrado translates to macaroon (not quite what we need here) but as an adjective it translates to ‘almond’ (as a modifier, so like an almond cookie not the almonds in the cookie). But this is an interesting term because in the definition this website provides it’s referring to el gusto de los frutos secos (the taste of nuts), not almonds specifically. Typically fruto seco is the literal translation of ‘nut’, although nuez (following the masculine/feminine rule) is the more appropriate translation of the nut itself (and not the tree). So my interpretation is that they’re really saying the olive oil tastes “nutty”, not “almondy”. [Note: See how Google’s AI can learn something wrong (as mentioned above) given the source it is learning from uses non-words like almondy]

And I’ll wrap up (for now, there is a lot more in this source to discuss) with this:

Alpechín : Líquido acuoso residual que se obtiene del proceso de elaboración del aceite. Comprende el agua de la aceituna, el agua de adición y de lavado y un porcentaje variable de sólido. Alpechin: residual aqueous liquid obtained from the oil production process. It includes the water of the olive, the addition and washing water and a variable percentage of solid.

Google gets off the hook for not translating alpechin since there is no equivalent word in English. For this word the online dictionaries provide a definition, not a translation, and spanishdict says this is “water that oozes from a heap of olives”, which is a short version of what this olive oil diccionario is telling us. I doubt alpechin would ever be served in a restaurant, but who knows, maybe someone thinks it’s cool and so I’d be glad to have this term in the much smarter app I’m going to build.

And I’ll leave you with this. If you do even a little cooking you already know this but it’s interesting to see the actual description:

Virgen: Aceite que no ha sufrido artificio en su formación. Virgin: Oil that has not suffered artifice in its formation.

No ‘artifice’, eh!


Merluza a la Vasca

by Penelope Casas or Merluza y Almejas en Salsa Verde (by Teresa Barrenechea). These recipes are very similar and I made a version of these for dinner tonight.

I’ve been falling behind a bit. My research into Spain’s culinary vocabulary has been a bit slow of late. And I have some painful issue with my toe, so I’ve decided to give it a rest and therefore am not accumulating any mileage along my virtual trek of the Camino. I suppose I’d just have to push through it if this trek were real but I have the luxury of experimenting with various things that might alleviate any pain. After all I don’t want to injure myself in just practice and training. The idea of some physical limitation hitting on a real trek is discouraging but is one thing to deal with realistically.

So when it was suggested I should make dinner with some frozen hake we have in the freezer, I immediately looked for recipes for one of Spain’s most popular fish, merluza. The package lists the fish we had as Hake Loins (Merluccius), wild-caught as a product of Namibia, not exactly the Bay of Biscay (and not fresh) version of merluza. But this is probably the closest we can get here.

Both recipes, despite only one emphasizing Almejas (clams) in the title, call for fresh clams (and Casas wants mussels as well). No such luck here in the middle of winter in the Midwest. But we did have clam juice, which is generally good as a substitute for fish stock in cooking. The Barrenechea recipe calls for cooking the clams separately and then using the reserved cooking liquid for cooking the hake, so using clam juice instead is close.

At least a couple of ingredients come from our garden (huerto) – the parsley was harvested during the summer and frozen and the lemons are actually growing on a lemon tree in our atrium (with snow on the ground outside). When I lived in California I had a lemon tree in the backyard; here it lives inside but produces some very nice lemons, weird to see with freezing temperatures outside.

This is a bit different approach than I first learned from a Julia Child recipe (Filets de Poisson Pochés au Vin Blanc) that is a killer recipe. In the Spanish recipe the fish is lightly fried prior to poaching and the poaching liquid has already been thickened with flour (in the Julia Child version the poaching liquid has no starch and is reduced and thickened to make sauce after the poaching). That, plus frozen fish, made it a bit uncertain when the fish would be properly cooked. I extended the 20 minute cooking time by another 5 minutes and it might have been good to go just a bit longer. Frozen hake is a long way from the fresh petrale sole I used for Julia’s recipe but it was acceptable. Hake (and sole) is pretty mild in flavor so most of the flavor comes from the sauce.

So this was a decent little dinner but I wonder what it would be like in a really good restaurant in northern Spain. Hopefully I get the chance to find out.


Verbs again

In my previous post (about finishing initial processing of GallinaBlanca dictionary) I mentioned that verbs can be of some use in interpreting menus, possibly through derivatives of the infinitive form of the verb. So I’ve continued to do some digging in this area and have a few results to share.

Anticipating I’d be looking at verbs, independently of extracting them from the GB dictionary I used about nine online “lists” to compile an aggregate list. These verbs: a) may have nothing to do with cooking or cuisine, b) tend to be more commonly used verbs, and, c) may not be used (at all, or in same way) in Spain. So this is the list I’m calling C.

In the process of other searches I stumbled onto a culinary glossary. It has no connection with Spain and therefore the Spanish words might come from any part of the world. And as I worked with it more extensively and carefully I observed many of the issues with online resources of unknown origin: a) misspellings (probably; I don’t want to jump to conclusions just because words seem to be misspelled), b) duplications, often including both the singular and plural form, c) words that make no sense appearing in a Spanish culinary dictionary (how did these drift in?), d) inconsistent formatting and thus order (e.g. A la cazuela vs Cazadora, A la). In a previous iteration of my project I created a “glossary” by merging information from many sources and eventually it became a pisto (hotchpotch, if I can use that word in a non-culinary sense), especially losing any notion of whether the words applied to Spain or some other Spanish-speaking area. So with these caveats I’ll call this list G.

And I have my list of verbs from the GallinaBlanca dictionary which I previously described. I’ll call this list D.

Now, simply, it’s too much work to compare the entirety of all three of these lists so I just did the subset (verbs only, of course) of verbs starting with A B or C. While this may be a biased sample it still reveals some interesting information.

I sorted the three lists together (with different fonts and colors for each list so I could distinguish them) and then did manual processing to consolidate like terms together. As a result I ended up coding each entry with GDC (with a - in place of the letter if the verb is not in that list). That generates the following table:

G-- 44
-D- 4
--C 35
GD- 28
-DC 1
G-C 9
GDC 5

There are 126 verbs that appear in at least one of these lists. Only 5 verbs appear in all three lists. The list with the largest number of unique verbs is G (glossary, 44), which indicates it is potentially very useful as it adds over 50% more verbs than I had previously found. The verbs in the C (common) list may have nothing to do with cooking or food (we’ll explore that later in the post) so this may not add much. Only 5 verbs from the GallinaBlanca list don’t appear in the glossary list, so whoever compiled that got most of the cooking verbs.
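The coding and counting I did by hand could be sketched in a few lines of Python. This is only a sketch: the three miniature lists below are made up for illustration (they are not the real G, D and C lists), and the function name is my own invention.

```python
from collections import Counter

def membership_codes(glossary, dictionary, common):
    """Map each verb to a three-character code such as 'GD-' or '--C'."""
    lists = (("G", set(glossary)), ("D", set(dictionary)), ("C", set(common)))
    all_verbs = set().union(*(members for _, members in lists))
    return {
        verb: "".join(letter if verb in members else "-" for letter, members in lists)
        for verb in sorted(all_verbs)
    }

# Made-up miniature lists for illustration; not the real G, D and C lists.
G = ["achicalar", "añejar", "asar", "batir", "cocer"]
D = ["albardar", "asar", "batir", "cocer"]
C = ["abrir", "beber", "cocer", "cocinar"]

codes = membership_codes(G, D, C)
tally = Counter(codes.values())
for code, count in sorted(tally.items()):
    print(code, count)

# Sanity check on the counts reported in the post; GDC = 5 is the
# "appear in all three lists" figure from the text.
reported = {"G--": 44, "-D-": 4, "--C": 35, "GD-": 28, "-DC": 1, "G-C": 9, "GDC": 5}
print(sum(reported.values()))  # 126
```

The real work, of course, is the consolidation of like terms before counting, which this sketch does not attempt.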

So looking at the verbs that are only in the C (common) list and not in either cooking related list we do see a few surprising omissions (I’m assuming that these are SO common no one bothers to include them):

abrir --C to open; to turn on; to whet (as in appetite)
agregar --C to add
añadir --C to add
beber --C to drink
calentar --C to heat, heat up, warm up; to inflame
cocinar --C to cook
combinar --C to combine, mix; to put together, match, coordinate
comer --C to eat; to have for lunch; [Latin America] to have for dinner
concinar --C not in any dictionary, probably misspelling of cocinar
convertir --C to turn into, convert into, change into, make
cortar --C to cut, cut off, carve, slice, cut out; to chop; to cut (dilute sense); …

So out of the 35 verbs that are only in the C (common) list, I’d probably include just these 11 in a general purpose culinary list.

Now some of the verbs in the G (glossary) don’t appear to be useful. Some have no definition in any of the dictionaries I routinely use, including the most authoritative of the Spanish language (which is NOT limited to Spain so could include verbs that don’t get used in Spain).  So here are a few I’d consider dubious to include in a culinary glossary:

achicalar G-- [Mexico] to cover in honey; soak in honey
añejar G-- to age; [vino] to mature; to get stale
apanar G-- to coat in breadcrumbs (also EMPANAR or EMPANIZAR)
apuntillar G-- to finish off (a toro); to round off
ataviar G-- to dress up
bardar G-- to thatch
blanchir G-- (not in dict) Wiktionary has it as a French term for make white
bresear G-- (from glossary) To cook to slow fire, during long time, with condiments (generally vegetables, wine, broth and spices). Clearly a spelling error since not found.
cantar G-- to sing; to crow, chirp
caramerizar G-- (not in dict), another spelling? [from glossary] Spread a mold with sugar honey.
castigar G-- to punish; to ground, keep in; to damage, harm
cerner G-- to sift, sieve (same as cernir, which is it?)
chapurrar G-- to speak badly

I wouldn’t include achicalar as it doesn’t appear to be used in Spain, but this raises a good point about my goal here. If I wanted to know the Spanish word, used in Spain, for an English word, I wouldn’t include anything that may be used only outside Spain. But my goal is asymmetric: to translate Spanish (on menus) into English only (so I can choose), so including a word in my corpus (and eventually my app) that is not likely to be used in Spain is not a problem (I do need metadata to note this for that term, however). If I never see the term, it does no harm that it is never found in any lookup. OTOH, it would be a problem if I were trying to translate English into Spanish, as in: don’t use a word not found in Spain. It appears, for instance, that frijoles, which is well known to most people in the USA who visit Mexican restaurants, is one such word: not commonly used in Spain, though quite possibly a Spaniard would know it. Using it might lead to a scene (from The Way) like “no tapas in Navarra, only pintxos”, and thus make you look foolish.
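To make the asymmetry concrete, here is a minimal sketch of a Spanish-to-English lookup that carries regional metadata along with each entry. The class name, the field names and the three sample entries are assumptions of mine for illustration, not a design I have settled on.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    spanish: str
    english: str
    regions: set = field(default_factory=set)  # empty = no regional restriction known
    note: str = ""

# Illustrative entries only; the region tags are my working assumptions.
CORPUS = {
    e.spanish: e
    for e in [
        Entry("achicalar", "to cover or soak in honey", {"Mexico"}),
        Entry("frijoles", "beans", {"Mexico"}, "not commonly used in Spain"),
        Entry("setas", "mushrooms"),
    ]
}

def lookup(term):
    """Spanish -> English only; regional words are kept but flagged."""
    entry = CORPUS.get(term.lower())
    if entry is None:
        return None
    text = entry.english
    if entry.regions:
        text += " [" + ", ".join(sorted(entry.regions)) + "]"
    if entry.note:
        text += " (" + entry.note + ")"
    return text

print(lookup("Achicalar"))
print(lookup("frijoles"))
print(lookup("setas"))
```

A term that never shows up on a menu in Spain simply never gets looked up, so keeping it costs nothing; the region tag only matters if the same data were ever used in the English-to-Spanish direction.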

blanchir (to make white, which isn’t exactly synonymous with blanch but one might assume that’s what this means) was interesting in that it did not occur in any dictionary but did have an entry in Wiktionary. The standard term for blanch is palidecer (purely in the sense of turn white) and escaldar or blanquear for the culinary sense. I suspect blanchir might be used somewhere (possibly Puerto Rico) where it is just the cognate of the English verb. But, again, in collecting the corpus I should not make judgments like this, although I might add metadata to a blanchir entry and meanwhile add it to the corpus and then let the “big data” statistical analysis decide if this is a word or not.

bresear really looks like a misspelling (more likely to be brasear, to barbecue) but again it should go into the corpus with a metadata notation rather than my passing judgment on it. IOW, only a real expert in Spanish should be deciding what to include or not in any translation dictionary, so if I find only one instance of a misspelled word it will get washed out since there are few occurrences of it in the corpus; OTOH, maybe people do commonly misspell this word so it needs to be in my app. caramerizar appears to be some variant of caramelizar, again perhaps used somewhere and not just a mistake. cerner has exactly the same definition (in the glossary itself, but also in spanishdict) as the more common spelling cernir, although both appear in the reverse lookup of ‘to sift’ in spanishdict (which is it, then? just a common confusion?). cernido is a possible term to see on a menu so it matters that my dictionary could spot this as the past participle of cerner.
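One way to attach a “possible misspelling of …” note without passing judgment is to look for a close match among verbs I already trust. The sketch below uses Python’s standard difflib module; the short list of known verbs and the 0.8 cutoff are arbitrary assumptions for illustration.

```python
import difflib

# A tiny stand-in for a trusted verb list (assumption for illustration).
KNOWN_VERBS = ["asar", "brasear", "caramelizar", "cernir", "cocinar", "cortar", "hervir"]

def suggest_spelling(word, known=KNOWN_VERBS, cutoff=0.8):
    """Return the closest known verb, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(word, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None

for suspect in ["bresear", "caramerizar", "concinar", "cerner", "chapurrar"]:
    print(suspect, "->", suggest_spelling(suspect))
```

Note that cerner gets matched to cernir even though both show up in dictionaries, which is exactly why a suggestion like this should only be stored as metadata on the entry and never used to silently “correct” the corpus.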

So again all this goes to show the work that must be done to really develop a very accurate dictionary that drives my app for menu translation (or to be published as a carefully researched culinary glossary).




How to use collected menus

I use this blog to document a project I’m doing which is to obtain an accurate and comprehensive set of terms (isolated words and phrases) to feed a smartphone app so I can “read” menus in Spain. To do this I am first collecting menus on my virtual “trek” (translating miles on a treadmill to position on the Camino de Santiago) and using Google map’s POI to find restaurants and then process those that have websites with some form of menu I can just extract (don’t want to be typing from images and make all those mistakes).

Most of the menus are in Spanish (rarely I can find one that is dual language, and even then: a) their translation may not be so great, and, b) the English menu may not be the same, so this can be tricky). So I use either Google Translate (if the menu is a standard HTML webpage) or some tedious copy-and-paste to use (really Microsoft) to translate. Of course these machine translations are often not that great (both wrong and missing many terms) and that is a big issue.

Doing this process is fairly mechanical and tedious, but doing it slowly also gives me a chance to really observe what is going on (plus get a bit of drill on words; my short-term memory of some Spanish terms is increasing, but based on past projects I know I’ll retain little of that). And, as I’ve documented in some posts, occasionally menu items completely befuddle the machine translation, which sends me off trying to figure it out myself, an interesting challenge since I have next to zero fluency in Spanish.

Now it is important to note my goal. Learning to speak and hear Spanish is entirely different, especially if you want to have conversations about almost anything (even if still oriented toward travel). I just need to be able to read menus (at least for my limited goal) and choose what I want. And I don’t need to translate in the other direction, so knowing whether ‘mushroom’ is hongo or seta doesn’t matter as much as going the other way.

And, of course, this also does imply knowing something about cuisine in Spain (which can be quite different than what we might encounter in restaurants in USA that happen to use Spanish on their menus). And it is turning out to require knowing something about agriculture in general in Spain, especially in different regions. An ingredient, like chorizo is: a) quite different than the Mexican style chorizo I’d find in markets or restaurants here, and, b) somewhat different in different regions in Spain as each has its own traditional way of making something like chorizo.

So after extracting menus from websites with some sort of translation I end up with side-by-side menu items, like below:

Gambas a la Plancha | Prawns on the Plate
Setas a la Plancha | Grilled mushrooms
Espárragos Especiales “Dos Salsas” | Special Asparagus “Two Sauces”
Ensalada Templada con Gulas y Rape | Tempered Salad with Gulas and Rape
Cogollitos de Tudela con Anchoas y Salmón | Tudela with anchovies and salmon
Tabla de Ibéricos | Iberian Table

I chose these particular items to make a couple of points:

  1. Notice that a la plancha occurs in two consecutive entries, and given that gambas are prawns and setas are mushrooms, that means there are two different ways to both parse and assign a tentative meaning to a la plancha: either ‘grilled’ or ‘on the plate’ (more literal). So what does it really mean? The answer, btw, is that plancha is really “iron”, meaning a cooking device: either a pan or the typical restaurant flattop is used to “grill” the item.
  2. In the fourth item gulas appears (and didn’t get translated) and rape is quite ambiguous (is it the English word and therefore shouldn’t be translated, or is it a Spanish word that means something entirely different?). gulas are baby eels (or possibly synthetic “worms”, like the fake crab) and rape is a type of fish with more than one translation (monkfish, anglerfish). So how can I use information like this?
  3. Cogollitos de Tudela got translated just to Tudela (the other words in this item are easy to match between the Spanish and English). This is actually a flaw (I believe) in Google’s translation process. Cogollitos, when looked up, gives “A small heart or flower of garden plant” (or sometimes, just ‘buds’) and Tudela doesn’t appear in any dictionary but turns out to be a town (really just a reference location) where a particular type of lettuce (looks like Romaine) is grown; when it is served at a restaurant the inner leaves are used (often in a very attractive presentation). So this is a fairly classic ingredient and dish, especially in northeastern Spain, but translation isn’t going to help much. So, a) how certain am I that I’ve figured this out correctly (or even how would I put some certainty on it, like how many different sources I found that confirm my guess at what this is, versus any counter-evidence), and, b) how should I use this information in my corpus?
  4. And what is “Iberian Table”? (a valid literal translation but not helpful). Now doing even a little research on menus one quickly learns that Ibéricos almost certainly refers to a prized pig, but how is it connected to Tabla? Sometimes one has to be careful here as I’ve already found an instance where silla (literally ‘chair’, but in the context, really ‘saddle’) refers to a cut of meat, so maybe the same is true with tabla? IOW, there is quite a lot of uncertainty here BUT this could be an important item to know. I suspect, BTW, it’s just a plate with some ham or other cured pork, like an antipasto.

So there are several steps in studying menus:

  1. the mechanical part of getting the Spanish aligned with some sort of translation to English
  2. studying the results for what appears to be clear one-to-one correspondence in terms. But beware: on this single menu both hongos and setas translate to mushrooms. Why are there two different words? (Previously hongos had shown up as primarily used in Latin America, not Spain, but obviously this menu contradicts that.) And if there is a difference (i.e. they’re not just synonyms), what is it? I have vague evidence that hongos refers to cultivated button mushrooms and setas to wild mushrooms (like shiitake or others). That is a big difference.
  3. Some items translate very little, so can I find other sources to determine what these items might be? (sometimes yes, sometimes no) And even if I figure out what a word (e.g. Cameros from yesterday’s post) or phrase (a la riojana from yesterday’s post) is, these are not literal translations, so how do I mark these? For instance I believe Cameros refers to the mountains in southern Rioja and therefore potentially a breed (or just the husbandry) of sheep that would be recognized as distinctive (like Wagyu beef). If I figure this out: a) what confidence do I put on this information, and, b) how do I encode this information in my corpus?

Once a corpus is obtained, the assumption is that a kind of “big data” analysis can help figure all this out (I haven’t quite figured out what code I’ll write for this; Google claims complex deep-learning AI as their method of training their translation and I don’t have the resources for that approach). But my assumption is that everything in my corpus will have multiple entries, and some a lot of entries. So by placing some sort of “certainty” weight on each pair and matching up pairs across a large data space, some sort of overall certainty can be derived (probably with a lot of exceptions that have to be looked at by human evaluation, which Google says they never do, which also might explain some of their odd translations).
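I haven’t settled on how to derive that overall certainty, but one simple candidate is a “noisy-OR” combination: treat each observation of a pair as an independent piece of evidence and compute 1 minus the product of the individual doubts. The sketch below assumes that rule (it is my assumption, not an established method for this corpus), and the observations and weights are invented for illustration.

```python
from collections import defaultdict
from math import prod

def combine_certainty(observations):
    """Noisy-OR: many weak sightings of the same pair add up to strong evidence."""
    grouped = defaultdict(list)
    for spanish, english, certainty in observations:
        grouped[(spanish.lower(), english.lower())].append(certainty)
    return {
        pair: 1.0 - prod(1.0 - c for c in values)
        for pair, values in grouped.items()
    }

# Invented observations: (Spanish, English, my certainty for that one sighting).
OBSERVATIONS = [
    ("a la plancha", "grilled", 0.5),
    ("a la Plancha", "Grilled", 0.5),
    ("a la plancha", "grilled", 0.5),
    ("a la plancha", "on the plate", 0.3),
    ("trucha", "trout", 0.9),
]

combined = combine_certainty(OBSERVATIONS)
for pair, certainty in sorted(combined.items()):
    print(pair, round(certainty, 3))
```

Three sightings at 0.5 end up at 0.875, close to a single sighting at 0.9. The big caveat is the independence assumption: ten menus all run through the same machine translator are not ten independent witnesses, so a real version would need to discount repeated evidence from the same source.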

So, just to finish this let me provide an example. From this single menu I extracted (manually, can’t quite imagine how to do this in code) the following table of “pairs” where I’m relatively certain these are correct. IOW, these are mostly just the terms derived via literal translation not the more complicated cases where a lot of guessing is required.

Note: more discussion after this table, please scroll down.

Spanish | English | Spanish | English
a la Plancha | Grilled; on the Plate | Lechal | Baby lamb
a la Vinagreta | Vinaigrette | Lenguado | Sole
Agua | Water | Limón | Lemon
al Horno | Baked | Macarrones | Macaroni
Albóndigas | Meatballs | Menestra | Stew
Anchoas | Anchovies | Merluza | Hake
Arándanos | Blueberries | Milhojas | Fillets
Arroz | Rice | Mixta | Mixed
Asado | Roasted | Oveja | Sheep
Bacalao | Cod | Pan | Bread
Bebida | Drink | Patatas | Potatoes
Berenjena | Eggplant | Pato | Duck
Bistec de Ternera | Beef Steak | Pescados | Fish
Calabacín | Zucchini | Pimienta | Pepper
Calamares | Squid | Pimientos | Peppers
Carne | Meat | Postres | Desserts
Carrilleras | Cheek pieces | Precio | Price
Cerveza | Beer | Primeros Platos | First courses
Codillo | Knuckle | Puerros | Leek
compartir | share | Pulpo | Octopus
Cordero | Lamb | Queso | Cheese
Croqueta | Croquettes | Rape | Anglerfish
de la Abuela | Grandma’s | Rebozado | Coated
de la Casa | of the House | Refresco | Soda
elegir | choose | Rellenos | Stuffed
en su Tinta | in ink | reservas | reservations
Ensalada | Salad | Revuelto | Scrambled
Entrantes | Starters | Rojo | Red
Espárragos | Asparagus | Sabores | Flavors
Fresco | Fresh | Salsa | Sauce
Frutas | Fruit | Setas | Mushrooms
Gambas | Prawns | sobre | on
Gaseosa | Soda | Solomillo | Sirloin
Guisado | Stew; Stewed | Tarta de Queso | Cheesecake
Helado | Ice cream | Tomate | Tomato
Hongos | Mushroom | Trucha | Trout
Huevo | Egg | Verduras | Vegetable
Incluye | Includes | Vino | Wine
Jamón | Ham | Yogurt Griego | Greek Yogurt
Judías Verdes | Green Beans
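I built this table by hand, but a crude first pass at pairing words could come from co-occurrence counting: a Spanish word and an English word that keep turning up in the same menu items probably belong together. The sketch below scores candidate pairs with the Dice coefficient; that choice of score is my assumption, and the three menu items are made-up combinations of words from the table (only Cordero Guisado is from the actual menu).

```python
from collections import Counter
from itertools import product

def align_words(item_pairs):
    """For each Spanish word, pick the English word with the highest Dice score."""
    es_count, en_count, both = Counter(), Counter(), Counter()
    for spanish, english in item_pairs:
        es_words = set(spanish.lower().split())
        en_words = set(english.lower().split())
        es_count.update(es_words)
        en_count.update(en_words)
        both.update(product(es_words, en_words))  # every cross pairing in this item
    best = {}
    for (es, en), together in both.items():
        score = 2 * together / (es_count[es] + en_count[en])
        if es not in best or score > best[es][1]:
            best[es] = (en, score)
    return best

# Illustrative menu items (Spanish, English); not a real extracted menu.
ITEMS = [
    ("Cordero Asado", "Roasted Lamb"),
    ("Cordero Guisado", "Lamb Stew"),
    ("Pato Asado", "Roasted Duck"),
]

alignment = align_words(ITEMS)
for spanish, (english, score) in sorted(alignment.items()):
    print(spanish, "->", english, round(score, 2))
```

This only handles single words, so multi-word phrases like a la Plancha or Judías Verdes, and items where the translation dropped or added words, would still need the manual treatment.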

So a single menu provided a significant (about 80 items) source of raw material to feed into my corpus. Now I’ll just note a few things as to whether further processing should be applied to this list before adding it to a corpus (or, IOW, what metadata should also be embedded in the corpus).

  1. Judías Verdes ‘green beans’: Should there be an entry for verdes as ‘green’ and judías as ‘beans’? Now in Spanish adjectives match their noun in both number and gender so verdes might not be the lookup dictionary form for ‘green’ (it’s not, the singular verde is). So that could introduce some confusion in the corpus. And ‘bean’ has multiple translations, with often one word being used for the dried beans (or the seeds in the bean pod) versus the whole bean, as in typical green beans.
  2. What about Guisado? This had two literal translations from Google: ‘stew’ and ‘stewed’. And in English those are not the same thing even though they’re related. guisado is the past participle of the verb guisar, which can mean either just simple ‘cook’ or also ‘stew’. The contexts in this menu for the two uses of guisado are “Cordero Guisado” and “Cordero Guisado con Pimientos”, so why is Google convinced it’s ‘stew’ (the noun) and ‘stewed’ (the conjugated verb) in these two contexts? Is it right?
  3. Another thing I noticed is that often the English translation doesn’t match the Spanish in number. Figuring out plural and singular forms in a corpus analysis process could be interesting, so putting in an incorrect corresponding pair could be problematic.
  4. And, finally (for today) nouns probably fit into a literal translation mode easier than other parts of speech, or especially colloquial usage, so trucha as trout is fairly high certainty but what about mixta as ‘mixed’? It was used in the context of ensalada (salad) and that item appears to be a typical mixed salad (often “house” salad in US restaurants) but the literal translation of ‘mixed’ would be more likely  variado or diverso; mixta doesn’t occur in lookup dictionary at all, but mixto does in the sense of mixed of both sexes (i.e. a group of people), so why did the salad menu items decide to use feminine form or even mixto at all?
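For the number and gender mismatches in points 1, 3 and 4, one hedge is to generate several candidate dictionary forms for a word and accept the first one a dictionary actually knows. The rules below are deliberately naive (strip -es, strip -s, swap a final -a for -o) and the five-word dictionary is an invented stand-in, so treat this as a sketch rather than real Spanish morphology.

```python
def dictionary_form_candidates(word):
    """Return possible lookup forms, most literal first."""
    w = word.lower()
    candidates = [w]
    if w.endswith("es") and len(w) > 3:
        candidates.append(w[:-2])          # calamares -> calamar
    if w.endswith("s") and len(w) > 2:
        candidates.append(w[:-1])          # verdes -> verde, setas -> seta
    # Feminine -> masculine guesses come last so real -a nouns win first.
    candidates += [c[:-1] + "o" for c in candidates if c.endswith("a")]
    return candidates

def lookup_form(word, dictionary):
    """Return (dictionary form, English) for the first candidate found."""
    for candidate in dictionary_form_candidates(word):
        if candidate in dictionary:
            return candidate, dictionary[candidate]
    return None

# Invented mini dictionary of singular / masculine lookup forms.
DICTIONARY = {
    "verde": "green",
    "judía": "bean",
    "calamar": "squid",
    "seta": "mushroom",
    "mixto": "mixed",
}

for menu_word in ["Verdes", "Judías", "Calamares", "Setas", "Mixta"]:
    print(menu_word, "->", lookup_form(menu_word, DICTIONARY))
```

Generating candidates and checking them against a dictionary avoids having to decide up front whether postres should lose -s or -es; the wrong guess simply isn’t found.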

So there are lots of challenges: extracting the raw data itself, assigning some metadata to the pairs to qualify how they should be treated in the corpus, and especially assigning some certainty value, i.e. like a probability, where 1.0 would probably never occur (there is always some ambiguity) and 0.0 is meaningless to even include. BUT maybe a single scalar value is insufficient, since it’s possible to have highly incompatible, if not even mutually exclusive, interpretations.
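One possible shape for a corpus entry that avoids the single-scalar problem is to give each Spanish term a list of interpretations, each with its own certainty, sources and note. The class and field names below are hypothetical, and the certainty numbers for Tabla de Ibéricos are just my gut feel, included only to show the structure.

```python
from dataclasses import dataclass, field

@dataclass
class Interpretation:
    english: str
    certainty: float                      # must be strictly between 0.0 and 1.0
    sources: list = field(default_factory=list)
    note: str = ""

    def __post_init__(self):
        if not 0.0 < self.certainty < 1.0:
            raise ValueError("certainty must be strictly between 0.0 and 1.0")

@dataclass
class CorpusEntry:
    spanish: str
    interpretations: list = field(default_factory=list)

    def add(self, english, certainty, source, note=""):
        self.interpretations.append(Interpretation(english, certainty, [source], note))

    def best(self):
        """Highest-certainty reading; the others stay available for review."""
        return max(self.interpretations, key=lambda i: i.certainty)

entry = CorpusEntry("Tabla de Ibéricos")
entry.add("Iberian Table", 0.2, "Google Translate", note="literal, not helpful")
entry.add("platter of cured Ibérico pork", 0.6, "my guess", note="needs confirmation")

print(entry.best().english)
for reading in entry.interpretations:
    print(reading.certainty, reading.english, reading.sources, reading.note)
```

Keeping the competing readings side by side, instead of collapsing them to one number, leaves room for the human review step when interpretations are incompatible.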

So all of that is a lot of design work to do and then probably an iterative process once I get some code that can crunch the corpus (thus far, I’ve done some by hand to look for design issues). And, fundamentally, is this even a process I can automate at all or at most the code just brings together related pairs for me to analyze with my intelligence.

Who knows, time will tell.

p.s. [personal]. Doing this mechanical work (and some background study as I go along) and also writing these posts is definitely cramming some Spanish into my brain, but I also know that’s a short-term effect. A year from now I’m not going to remember that guisado is the past participle of guisar or that it is related to stews/stewing (as a cooking process). So converting this work into: a) a more permanent and usable form (like a smartphone app to carry with me to Spain), and/or, b) some drill programs so I could “brush up” just before leaving would have a more useful effect.


A la Riojana

My virtual trek has now taken me just past Nájera in La Rioja and there is one restaurant there, Los Parrales, that offers the following menu (plus individual items with a la Riojana as a modifier): [Note: translations are from Google Translate of webpages]

Menú Típico Riojano | Typical Riojano Menu
Los sabores más tradicionales de La Rioja | The most traditional flavors of La Rioja
Descubre la gastronomía riojana de la mano de nuestro menú típico riojano. | Discover the Riojan gastronomy hand in hand with our typical Riojan menu.

If you’ve ever gotten wine from Spain you’ve heard of Rioja. This is the best-known and probably premier wine growing area. Like Napa (which is a region, county and town) La Rioja is a political entity, an autonomous community of Spain, consisting of a single province. The capital is Logroño, which was my previous stop on the Camino. The wine region of Rioja is not exactly the same area as the political entity but roughly aligns with it. And Riojana is the demonym of people and things from this region. A la Riojana is a designation, used with food, to indicate the preparation is the one typically used in Rioja. This is similar to Italian practice, e.g. alla Bolognese (a meat-based sauce originating from Bologna).

But what is it?

For me to answer, neither being there in person nor an expert in either Spanish language or cuisine of Spain is a bit of a stretch, so I suggest you find other sources (I’ll be providing some), especially from anyone who is describing their personal experience with a la Riojana.

This article, while in Spanish, has a better description than I can provide. Teresa Barrenechea lumps La Rioja together with Navarra and Aragón, emphasizing the connection to La Ribera del Ebro (the second major river of the entire Iberian Peninsula; ribera is its riverbank, the obvious fertile area for growing crops). While wine is the hallmark of Rioja it is not used, directly, in the cuisine. Instead the cuisine is dominated by vegetables which grow well in the same conditions as vineyards. The cuisine uses less of the fresh seafood of further north but a bit more lamb and beef. It is simple and hearty.

Perhaps the most classic dish is:

Patatas a la Riojana Potatoes Riojana’s style

This is a fairly simple stew (description and typical recipe) of potatoes and chorizo (riojana version) seasoned with ample paprika. It definitely seems to meet the simple and hearty description that is characteristic of much of a la Riojana. It’s amusing that one of the links (for receta) I provided actually uses patatasriojana as the domain name (I guess someone thinks it’s famous).

In terms of fish this also seems to be a classic (description and typical recipe):

Bacalao a la Riojana Cod to the Riojana

I ate numerous cod dishes in Portugal (from desalted salt cod, interesting to see huge piles of it in markets) and, well, it’s pretty blah. The tomato sauce and at least some hint of pepper might elevate this dish above blah levels, but it seems hard to get excited about it.

A more interesting meat dish (again one that seems to be a Riojana classic, although not on the menu of this restaurant) is las chuletas al sarmiento (chops with the vine shoot, description), which is roast lamb using the trimmings from grape vines as the smoking wood. The restaurant did have a special menu focused on this dish:

Menú Típico Riojano Especial Lechazo | Typical Rioja Special Lechazo (lit: suckling lamb) Menu
Lechazo de Cameros Recién (lit: newly or recently) Asado | Newly Roasted Cameros
A elegir de los primeros platos y postres del Menú Típico Riojano | Choose from the first dishes and desserts of the Riojano Typical Menu

Now lechazo is a preparation (English link, Spanish link) [and also an alternative term] of cordero lechal. cordero is a lamb (in general), and lechal (derived from leche (milk)) implies a very young and unweaned lamb. So this menu is a variation (at this restaurant and therefore for a certain price) of their Menú Típico Riojano where the lechazo is the required segundo plato.

But it took some looking to finally conclude (as best I can) that the de Cameros (tough to search since a car gets most hits) refers to a geographical area within Rioja, i.e. in and around the Sierra de Cameros in the south center of La Rioja (in the region of La Rioja Media). Presumably the sheep from this area must be special enough that they’d be labeled on the menu. But this is another typical challenge of deciphering menus.

And then there was this item on the menu:

Revuelto Riojano Scrambled Riojano

I’ve actually mentioned revuelto in a previous post, but the images and recipes I find for this seem to be showing off the vegetables that also characterize Riojana. Here is a recipe focused on a version emphasizing peppers. And another recipe that seems to have a bit of everything in a big pile, which seems to be merging revuelto and pisto (I love this translation of pisto, hotchpotch, or more conventionally ratatouille).

So that’s our brief tour of La Rioja. Too bad it’s only words, since sights and smells and tastes, in real life, would be so much better.

Note: This post took me so long to get it published I’ve passed Nájera and blown into the next town, Azofra. It has a couple of restaurants labeled on the Google map but none (thus far) that appear to have websites to extract their menus. But as a recommendation, Dear Reader, it’s handy to look at these places on Google maps because they collect many photos of food for each restaurant that is a POI. Here is a good interactive map with a GPS trace of the Camino de Santiago.



Finished the GallinaBlanca Diccionario

I’ll explain what “finished” means in a minute but first, I am almost at another milestone in my journey: 1/2 mile outside Nájera, about 20 miles from Logroño and about 60 miles from Burgos, on my virtual camino trek. That is, since I’m stuck here in the cold midwest USA, I do miles on my treadmill in the basement (training for the Camino, I wish!!!) and translate those boring miles onto a GPS track of the Camino de Santiago and then, most of the time, do a little “walking” courtesy of Google StreetView (the Camino is hardly a wilderness trail if a Google car is driving on it).

So what does it mean when I say I finished the GB dictionary? Well, it means the tedious part is over. Their dictionary is provided via Javascript popups, with one page for each letter of the alphabet, and thus: a) there is no way to easily grab all the terms out of the HTML, and, b) Google Translate doesn’t operate on the popups. So I have to manually click each term, use the mouse to get the text of its definition in Spanish, paste that in my MSWord document and in the webpage, get the translation (which it turns out seems to actually be provided by Microsoft; I tried the translation built into MSWord itself and it was pretty ragged), mouse that translation and then paste it in the side-by-side table. Then I take the term and attempt to get a simple literal translation (more pasting, possibly into three different webpages).

Needless to say this is big-time tedious (and slow) and that’s what I’ve finished. It may be tedious but going slowly through the list means I take the time to study each result. Often even from the English translation of the definition of the term I really don’t know what the English word would be, which makes that lookup sometimes a surprise. Since this is a specialized vocabulary for cooking many of the terms are more obscure and thus missing in dictionary lookups so it’s off to doing searching and guessing and trial-and-error until I get a reasonable answer. Lots of work but a good learning experience.

So now I have that “done” (probably a few mistakes I’ll have to clean up). So I have pages of stuff like this:

HERVIR (literally boil) Cocer en líquido a una temperatura de 100º. Cook in liquid at a temperature of 100 º.
HORNEAR (literally bake) Cocer en el horno mediante calor seco. Cook in the oven with dry heat.
HUMEAR (literally smoke or steam, and one sense is exactly this definition? ahumar is the culinary verb) Se dice cuando el aceite desprende humo, indicando que está caliente, a punto. It is said when the oil emits smoke, indicating that it is hot, ready.
INCORPORAR (literally incorporate, add, include and mix in) Agregar, unir algo a otra cosa para que haga un todo con ella. Add, join something else to do a whole thing with it.
INSTILAR (literally instill) Echar poco a poco, gota a gota, un líquido en otra cosa. Slowly pouring, drop by drop, a liquid into something else.
LAMINAR (literally laminate) Cortar en láminas muy finas. Cut into very thin slices.

So what am I going to do with this now?

I deliberately picked a chunk of the dictionary that is all verbs because that’s my first attempt to create something derived from this list. There are a lot of verbs in this dictionary because it accompanies recetas (recipes) and these verbs (in some conjugated form) probably occur in the collection of all those recetas. So GallinaBlanca is nicely helping cooks read recetas that might contain a verb they don’t know. There are some fairly obscure verbs in the list.

Now what has this got to do with reading menus, which is the focus of my project? Rarely are the menus (at least the list of items you can order) going to have complete sentences explaining the food (perhaps a brief description, just a phrase). So verbs don’t much matter.

Or do they? A word you will frequently see on menus (even in the names of restaurants) is asado. This is grilled or roasted (as an adjective, perhaps modifying some noun) or even just a noun in its own right, grill or roast. But this word has its root in a verb, asar (in the infinitive form, i.e. the typical word to look up in a dictionary). (Note: Online dictionaries are often smart enough to handle conjugated forms, but typical non-interactive dictionaries (paper or smartphone) require you to see that this is a conjugation of a verb and deduce the infinitive form to do the lookup, which is not easy if you’re unfamiliar with Spanish.) asado is the past participle of asar, and as Spanish verbs are far more regular (with some exceptions) than English ones, there is almost an algorithmic rule to form the past participle from the infinitive verb (like ‘to bake’ and ‘baked’ as a regular case in English). So in a quick extract from my list here are a couple more examples: hervir (to boil) hervido (boiled), estofar (to stew) estofado (stewed), picar (to mince or chop) picado (minced).
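That “almost algorithmic” rule is short enough to sketch: -ar verbs take -ado, and -er and -ir verbs take -ido. The reverse direction is what a menu reader actually needs, so the sketch also guesses candidate infinitives from a participle seen on a menu. The function names are mine, and the irregular table holds only three verbs as a reminder that exceptions exist; it is nowhere near complete.

```python
# A few irregular participles, just to show where exceptions would go.
IRREGULAR = {"abrir": "abierto", "hacer": "hecho", "freír": "frito"}

def past_participle(infinitive):
    """Regular rule: -ar -> -ado, -er / -ir -> -ido."""
    verb = infinitive.lower()
    if verb in IRREGULAR:
        return IRREGULAR[verb]
    stem, ending = verb[:-2], verb[-2:]
    if ending == "ar":
        return stem + "ado"
    if ending in ("er", "ir", "ír"):
        return stem + "ido"
    raise ValueError(f"{infinitive!r} does not look like an infinitive")

def candidate_infinitives(menu_word):
    """Guess infinitives for a participle as it might appear on a menu."""
    word = menu_word.lower()
    if word.endswith("s"):                 # asados, asadas -> asado, asada
        word = word[:-1]
    if word.endswith("a"):                 # asada -> asado
        word = word[:-1] + "o"
    irregular = [inf for inf, part in IRREGULAR.items() if part == word]
    if irregular:
        return irregular
    if word.endswith("ado"):
        return [word[:-3] + "ar"]
    if word.endswith("ido"):
        return [word[:-3] + "er", word[:-3] + "ir"]
    return []

for verb in ["asar", "hervir", "estofar", "picar", "guisar"]:
    print(verb, "->", past_participle(verb))
for seen in ["Asadas", "cernido", "abierto", "Rellenos"]:
    print(seen, "->", candidate_infinitives(seen))
```

An -ido participle is ambiguous between -er and -ir verbs, so the reverse lookup returns both guesses and leaves it to a dictionary check to decide which infinitive actually exists.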

So knowing some cooking verbs could come in handy. Memorizing them all is probably a waste of time but as I intend to collect everything I’ll need this in my smart app that is going to translate menus (having all the conjugations is then easy as well).

But I don’t like to depend on a single source for literal translation (each verb to its most direct English equivalent). Plus some verbs have a ton of different meanings and they are not always labeled as being the culinary sense in every dictionary. And some verbs don’t have much connection, given GallinaBlanca’s definition to the standard (at least online) dictionary definitions. For instance, this tough one to figure out:

ALBARDAR (literally: to saddle, put a  packsaddle on)  Envolver piezas de carne con lonchas finas de tocino, para evitar que se sequen al cocinarlas. Wrap pieces of meat with thin slices of bacon to avoid drying when cooking.

I suppose one might deduce that wrapping meat with bacon is “saddling” it, but really the clue comes from this:

Saddle is a butchery term that refers to the meat that is at the animal’s back and hips. Think of it in terms of the meat that would be in more or less the same place as a saddle on a horse.

I’ve done a fair amount of cooking (and reading cookbooks) and ‘saddle’ as a cut of meat never registered. Or what about this one:

CINCELAR (literally chisel, carve, engrave) Hacer incisiones en una pieza (se utiliza sobre todo para pescados) para facilitar su proceso de cocción, generalmente en los asados. Make incisions in one piece (mainly used for fish) to facilitate their cooking process, usually in roasts.

I’ve done exactly this cooking fish (and more so bread) but I don’t think I’d use any of those literal English verb equivalents to describe the process.

So there is a lot to learn from these verbs. And as I said I don’t like single sources so I sometimes use a page here in this blog (test data) to paste some Spanish in, view that page, and then fire up Google Translate (maybe there is some simpler way but this works without too much hassle).

Now from what I’ve read about Google Translate, context matters. So a pure list of verbs, especially in infinitive form, eliminates any possibility of a contextual AI-ish translation and thus yields just a simple literal translation. For verbs with many meanings there is nothing to clue Google in about which one to use.

So it was interesting to see how Google did on this translation. I found a total of 132 verbs in GallinaBlanca dictionary. Of these the following 44 had no Google translation:


Now Google can be forgiven (except that it claims its AI does better than rule-based literal translation) for the verbs in RED since none of my dictionaries know what these are. For instance I actually think acidelar is just a typo since the definition GB gives it, “Put lemon juice or vinegar in the water to cook poached eggs or vegetables, so that they do not blackened. ”, is fairly similar to that of the known acidular, whose definition is “Sprinkle with an acidic liquid fruit, vegetables or vegetables so that they retain their whiteness or colour.” But the definitions are not exactly the same and for me to declare acidelar to be a mistake is premature; after all it could be some alternate spelling or perhaps a regional difference from the standard dictionary Spanish, or, worse, it might be the spelling used in Spain versus what is used elsewhere. I simply do not have enough data to decide.

So what about something like

MOREAR (not in any dictionary) Dar vuelta sobre el fuego bajo y con un poco de aceite en un sartén o cacerola a los alimentos, para que tomen color antes de añadirle salsa o caldo. Turn over the low heat and with a little oil in a frying pan or pan to the food, so that they take color before adding sauce or broth.

This comes up blank in all dictionaries and most web searches I’ve tried. So the question is: do I believe this is even a word (or perhaps it’s from some other language used in Spain)? It certainly sounds like sauté (the cooking technique) but that is saltear, which GB defines as “Stir the food in butter or hot oil when frying in an uncovered skillet.”

Now for the words not in RED I did find literal translations, including for ASAR, which I find surprising that Google doesn’t know (this, as you recall, is the verb I used as the example above to explain why I’m investigating verbs, i.e. it is the infinitive root for asado, a very common word on menus). And I’m also surprised it didn’t know GUISAR (cook, stew; cook up) since I can recall from memory seeing that and especially its past participle guisado (which refers, as a noun, to casserole, stew, or, most generically, dish, and as an adjective to stewed). And I’ve seen rebozado (covered in batter or breadcrumbs) on numerous menus and it’s the past participle of REBOZAR (to coat in batter or breadcrumbs), which Google didn’t know. Now, OTOH, TRUFAR (try to guess before reading the translation) is probably sufficiently obscure that Google may not have seen it, but given the price of the item this word refers to you’d want to know what it means if you saw it on a menu (it means to stuff with truffles).

Now as for the other verbs, for which Google did have some translation, I’m going through a somewhat tedious process of digging out (again, but this time in a single consistent process) the literal translations so I can compare Google to other sources. And sources are going to matter. Not only is it hard to say with absolute certainty what an appropriate translation is going to be (I believe even fluent Spanish-speaking authorities might debate some verbs), but I also need to do this comparison of various sources in a systematic way, not believing one source over another until I can potentially “confirm” a translation via some processing of a large corpus of translated food-related material, IOW, exactly what I’m building up now.

For the verbs Google did translate here are a few of the issues I’ve found thus far (not done with this analysis):

  1.  Often Google chooses the present participle as the translation instead of the infinitive, e.g. ADOBAR, Google says marinating instead of to marinate, not a big deal overall but this might get into a corpus and create a statistical flaw later in the analysis.
  2. For AVIAR Google picked the most literal, namely the adjective ‘avian’, rather than to prepare as the root verb (multiple meanings; this one matches the GB definition, “Prepare birds for cooking. It consists of all pre-elaborations that must be made to a piece: cleaning, flamed, wicking, flanged, etc.”).  Note that GB has defined this in a more specific way than the dictionary did, and given the Latin root for both the verb and the adjective the GB definition is definitely superior (plus being more useful to understand in the context of cooking).
  3. Picking one of several literal translations, but not the one in the culinary sense (which I look for, because I know culinary is the context), e.g. BRIDAR, which Google translates as ‘bridle’ (literally OK), but to tie or truss is much more useful in the cooking sense.
  4. Or something like DESPLUMAR, for which Google picks the present participle Fleecing, which is a plausible translation. But the GB definition is “Remove the feathers from the bird.”, which comes closer to an alternate definition, ‘to pluck’. Amazingly, fleecing matches a colloquial usage somewhat like the English one, where someone is taken advantage of and thus “fleeced”.
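The first issue in the list (a participle where an infinitive was expected) is the sort of thing that could be flagged mechanically before a pair goes into the corpus. Here is a rough sketch, assuming the Google output is held as a simple verb-to-gloss mapping; the function names and the “single word ending in -ing” heuristic are assumptions for illustration, and the heuristic would also wrongly flag English words such as “bring”.

```python
def looks_like_participle(gloss):
    """Heuristic: a one-word English gloss ending in -ing."""
    words = gloss.strip().lower().split()
    return len(words) == 1 and words[0].endswith("ing")


def flag_participle_glosses(glosses):
    """Return the Spanish verbs whose gloss should be reviewed by hand."""
    return [verb for verb, gloss in glosses.items() if looks_like_participle(gloss)]


# Google's output for three of the verbs discussed above
google_glosses = {
    "ADOBAR": "marinating",
    "BRIDAR": "bridle",
    "DESPLUMAR": "Fleecing",
}

print(flag_participle_glosses(google_glosses))  # ['ADOBAR', 'DESPLUMAR']
```

Flagged entries still need a human decision; turning “marinating” back into “to marinate” is not something this sketch attempts.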

I’m sure there will be more as I finish grinding through but this post, already TMI, hopefully gives a sense of how I’m post-processing the pure mechanical part of my study to pound the raw data into a more usable form to then create my corpus (all preliminary to creating my AI-ish smart menu translator).




Challenging menu to decode

While I’m continuing to work on the GallinaBlanca diccionario (almost done) I’m getting close to the next town on the Camino so I decided to try to work on another menu from a Logroño restaurant. It’s an interesting website as there are actually three different restaurants, with some common connection:  KABANOVA, PASIÓN POR TI, and LETRAS DE LAUREL. In addition to menus (unfortunately in PDFs so Google Translate doesn’t work) there are numerous photos of their Especialidades (Specialties), some of which are fairly mystifying as to exactly what the item is.

I’ve decoded most of the three menus from Kabanova – MENÚ GASTRONÓMICO (Gourmet Menu, an 11-dish tasting menu), MENÚ PASIÓN (???, Pasión is literally passion, but really the name of the common grupo that runs these), and NUESTRO MENÚ DEL MEDIODÍA (Our midday Menu). It’s this midday menu that I’ll discuss in this post. I’ve mentioned menu del Dia before; it’s an economical way to order several courses from a limited menu. While it is most often referred to as menu del Dia, it is most likely to be offered only at lunchtime (really around 14:30) and on weekdays only, so calling it mediodía actually makes a lot of sense.

Menú para 1 persona DE LUNES A VIERNES Menu for 1 person from Monday to Friday

The menu basically has these five parts:

APERITIVO de la casa. Aperitif of the House
POSTRES incluye un postre casero DESSERTS include a homemade dessert
Incluye 1/3 botella Tinto Reciente DOC Rioja, agua mineral y ración de pan Includes 1/3 fresh bottle of wine DOC Rioja, mineral water and bread ration

Many restaurants will have Entrantes (instead of the Aperitivo) on their menu del Dia, but judging from the pictures and other information this establishment is putting its bar forward rather than some small plate.

The first item under PRIMEROS had some fun translation to do:

Menestra fresca de verduritas de Calahorra con su velouté, crispy de alcachofa y polvo de jamón Fresh stew of Calahorra vegetables with its velouté, crispy artichoke and ham powder

I’ve mentioned that literal translation won’t work to decode many items on menus in Spain, and we see a few examples of that here: 1) Calahorra doesn’t have an English translation because it’s actually the name of the second largest city in La Rioja, which has as its major activity the growing and distribution of prized vegetables, so using this term is emphasizing the quality and freshness of the verduritas (vegetables) used; note that verduritas is a diminutive of the standard dictionary term verdura; 2) velouté doesn’t translate to English because it’s actually a French cooking term (one of the five “mother” sauces); 3) crispy is interesting since it’s already an English word and not Spanish, so I guess they thought this sounded appealing; and 4) polvo de jamón (ham powder) actually does seem to be what its literal translation implies. I found a number of recetas (recipes) on the Net for this and it’s just ham, dried in an oven and then ground up, that is used like a seasoning from a shaker. Menestra fresca itself has numerous recipes but basically it’s a stew of multiple vegetables (you can find lots of images of it on the Net).

Another item is interesting:

Ensalada de la Ribera con rulo de queso cabra. Riverside salad with goat cheese curler

Yes, de la Ribera translates to riverside, which doesn’t tell you anything.  However, this link gives you a good picture and explanation of this common salad in Basque areas. The lettuce, which looks like romaine, is not, and is actually a specialty in this part of Spain, often called COGOLLOS DE TUDELA (buds, really cores, of Tudela, which is a municipality of Navarre). It is connected to de la Ribera because it is grown in the surroundings of the Ebro river banks.

This is a curious item:

Nuestro plato de cuchara del día Our dish of spoon of the day

A dish of spoon, sounds odd. Actually I’ve encountered this labeling before and it basically means a dish that would be eaten with a spoon, like a soup but possibly something else than sopa (soup) and so therefore labeled more generally than sopa. IOW, you have no idea, from the menu, what this will be. Since this would be one of the three choices under PRIMEROS you’d really have to discuss this item with your server or just opt for one of the other two choices or take your chances.

So, IOW, to understand and decide which of the three PRIMEROS you’d order requires knowing a lot more about food in Spain than your literal translation dictionary is going to tell you. And again, for me, the challenge is how any app for a smartphone could explain all this (or how the search would work because the exact wording would vary from menu to menu even for the same items).
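One way the search might cope with wording that varies from menu to menu is to look for known phrases anywhere inside a menu line, after lowercasing and stripping accents. The sketch below is only an assumption about how such a lookup could work; the glossary entries are a tiny hand-made sample drawn from this post, and the function names are hypothetical.

```python
import unicodedata


def normalize(text):
    """Lowercase and strip accents (note this also turns ñ into n)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_known_phrases(menu_line, glossary, max_words=4):
    """Return glossary keys found in the line, longest phrases first."""
    words = normalize(menu_line).replace(",", " ").split()
    found = []
    for size in range(max_words, 0, -1):
        for start in range(len(words) - size + 1):
            phrase = " ".join(words[start:start + size])
            if phrase in glossary and phrase not in found:
                found.append(phrase)
    return found


# Hand-made sample glossary; keys are already normalized
glossary = {
    "plato de cuchara": "a dish eaten with a spoon (soup, stew, ...)",
    "polvo de jamon": "dried, ground ham used like a seasoning",
    "calahorra": "city in La Rioja known for its vegetables",
}

line = "Nuestro plato de cuchara del día"
for phrase in find_known_phrases(line, glossary):
    print(phrase, "=", glossary[phrase])
```

Exact phrase lookup like this ignores the words around a known term, but it would not catch misspellings or reordered words; that would need some kind of fuzzy matching on top.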

I’ll wrap up with just a couple more interesting items (and this menu is this restaurant’s shortest one so there is lots to decode here, as you might not want just the limited choices of menu del Dia).

Bacalao confitado en aceite Arbequina sobre cama de pisto. Cod confit in Arbequina oil on ratatouille bed

Arbequina is another word that has no translation to English. That’s because it is a particular cultivar of olive, that is, you just have to know what it is.

Carrillera de ibérico 36H con manzana, zurracapote y su crujiente. Iberian cheek 36H with apple, mulled and crispy

mulled is an interesting translation of zurracapote which is actually a wine drink (similar to the more familiar Sangria). And, no clue what the 36H means?

Secreto ibérico a la brasa con salsa Teriyaki y piña caramelizada Grilled Iberian secret with Teriyaki sauce and caramelized pineapple

‘secret’ is the literal translation of secreto but what is it? It sounds close to what might be called “mystery meat” here. But in fact it is a very expensive ($51/lb) cut (usually from pig and, while it differs between butchers, usually from the shoulder) that is similar to skirt steak. And, as I’ve already mentioned, ibérico (Iberian) is the very prized “black” pig. Given everything else about this restaurant it most likely is the de bellota type (essentially free-range and the most expensive pork) but  ibérico alone doesn’t imply that, so again you might want to ask.

There is a lot more to the menu for this restaurant and even more for the other two associated restaurants so I recommend this site (scroll down enough to see the images) as one to consider.  Maybe you can figure out what Vieira gallega con sus rabas de bogavante is or GinTonic riojano is or especially oído cocina is (all have pictures in the  Especialidades). And I’m still trying to decide what I think is a lingote de cordero (either ingot or slug of lamb) – is this a cut of the lamb or a quantity indication or a preparation? I couldn’t find that and the photo doesn’t make it clear either.

BTW: This website also has a blog and this post really boosts Logroño as a foodie stop, especially (as I’ve seen mentioned elsewhere) “both in the mythical Laurel Street”. And, interestingly, while backtracking to include this in my post I discovered another blog post which tries to explain oído cocina.

Tough distinctions

In crunching through the GallinaBlanca dictionary I’ve encountered a significant number of words that seem to overlap in meaning,  or be synonyms,  or are difficult to distinguish. This is exacerbated by the issue that the main translation dictionary I use is asymmetrical (as I’ve posted before) – that is, looking up X gets Y as the translation, but then looking up Y gets Z and not X.
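The asymmetry is easy to check for mechanically once both directions of a dictionary are available as lookups. This is a minimal sketch with toy data: the Spanish-to-English entries match the glosses discussed in this post, while the single English-to-Spanish entry (and the function name) is an assumption made purely for illustration.

```python
def asymmetric_entries(es_to_en, en_to_es):
    """Return (spanish, english, back) triples where the round trip fails."""
    mismatches = []
    for spanish, english in es_to_en.items():
        back = en_to_es.get(english)
        if back is not None and back != spanish:
            mismatches.append((spanish, english, back))
    return mismatches


# Toy data: both verbs translate literally to 'season'
es_to_en = {"sazonar": "season", "condimentar": "season"}
# Assumed reverse entry, for illustration only
en_to_es = {"season": "sazonar"}

for x, y, z in asymmetric_entries(es_to_en, en_to_es):
    print(f"{x} -> {y} -> {z}")  # condimentar -> season -> sazonar
```

Every triple it reports is a candidate for the kind of “tough distinction” this post is about, since two Spanish words are collapsing into one English word.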

Sometimes my “confusion” is my short-term memory triggering me into thinking I have two different words for the same thing. For instance, today I encountered:

SAZONAR (literally season) Condimentar con sal y pimienta Season with salt and pepper

That looks like it might be a cognate but I can’t find whether it is. Now ‘to season’ is already bad enough in English. In many cooking shows it narrowly means just to add salt and pepper but in other cases it is used in a broader sense. But let’s check condimentar to see about Spanish usage:

CONDIMENTAR (literally season) Añadir sal, pimienta, especias, etc., a un guiso, según indicaciones de la receta. Add salt, pepper, spices, etc., to a stew, according to the recipe’s indications.

Well, that’s good because, given this is the broader sense, sazonar can be used in the narrower cooking-show sense even though both words translate literally to ‘to season’.

BTW: This was not what I meant to discuss at this point and it shows the benefit of blogging where I do attempt to do additional research before just spouting out my gut feel about some topic (I can think of someone more important who should do this).

Anyway, here was my original point about sazonar (even though the subtle difference with condimentar was my original main point of this entire post, good, just another example to relate). Here’s the other verb I (imprecisely) remembered as meaning something similar:

SALPIMENTAR (literally season; salt and pepper) Adobar algo con sal y pimienta, para que se conserve y tenga mejor sabor. Marinate something with salt and pepper, so that it is preserved and tastes better.

Now that I’m looking at this it appears to almost be a made-up word: given salt == sal and pepper == pimienta (the spice; the fruit, e.g. bell or piquillo, is pimiento in Spain, pimentón in Latin America), and given that verbs in Spanish usually end in -ar (or -ir, -er), this just looks like words jammed together to make a verb. But this word is in the definitive RAE dictionary so that makes it a real word. The Spanish edition of Oxford has:

Condimentar un alimento con sal y pimienta. Season a food with salt and pepper.

BTW2: My second mistake in doing this post was that I quickly searched (due to a vague memory of a similar term) and found salpimentar but actually thought it was salmuera, which is doubly wrong (since that is a noun) and the verb, therefore, is a phrase: (either) [poner, embeber] en salmuera for ‘to brine’. So I really went around in a circle here – my original vague notion was entirely wrong but I ended up, serendipitously, actually making the point of the article.

But, briefly, this was my main point with these examples of related words where literal translation doesn’t help much (or at all) to distinguish: a) hongo, seta and champiñón, and b) rabas, sepia, jibia and calamar. And these word sets also illustrate the need for multiple sources, since there is some disagreement between sources that then has to be evaluated. Or perhaps usage will also be different in different regions or by the heritage of the people using the words – oh joy.

So first, what is the word for ‘mushroom’ in Spain? After quite a few searches my conclusion (quite possibly wrong) is both seta and champiñón; hongo is used in Latin America for mushroom (in the culinary sense), but otherwise hongo would be used more in the scientific (botany) sense, as just fungus. Now in case you don’t know, mushrooms are the fruiting cap of a fungus; IOW, most of the fungus is what you don’t see, growing underground and then pushing through the surface to produce its spores (to spread itself further) via the cap, which is the part we eat. So there is a fair amount of confusion here that calls for precision to disambiguate and I wouldn’t expect that in most menus (the authors are chefs not scientists, or nit-picky programmers like me).  It also appears that the difference between seta and champiñón is: seta are a flat-topped mushroom (maybe chanterelle, oyster, even shiitake), whereas champiñón are a round-topped mushroom (like common button mushroom or cremini, even portobello). There seems to be a further connotation (at least in some sources) that seta would be wild and champiñón cultivated.

Now, so what? If you’re a bit of a foodie you’d have preferences for what type of mushroom you’d use how and also how you’d prepare it and so forth. And you’d probably know that most “wild” mushrooms are often dried and rehydrated vs simple button mushrooms are probably fresh AND wild mushrooms are a lot more expensive and also more flavorful (to the point some people don’t like them very much, preferring the blander button mushrooms, but in certain recipes bland is good). So you’re looking at a menu and going to pay out some serious € you’d want to know what you’re getting.

Now trying to distinguish rabas, sepia, jibia and calamar ran into a variety of problems. These, for English, might all be grouped under ‘squid’. With living things there is often the problem that laymen have a “common” (and often misleading) name whereas the scientists are more precise and have their taxonomic names (which are then rarely used). But in addition the method of preparation of these food items may influence the names as well (i.e. is calamari a dish, made from various squid species, or a specific animal in the scientific species sense?).

rabas were amusing to me as they translate literally to ‘bait’. The one time I ever went fishing on the ocean (on a charter boat) we bought frozen packages of small “squid” (my notion of what a squid is) to use as bait. But in Spain these are a prized delicacy. But Oxford defines them as this:

tentacle of a squid or other cephalopod, prepared fried as an appetizer

So any old cephalopod with tentacles will do?

I can’t find much for jibia, as it does seem to be an equivalent synonym for sepia, both of which are translations for ‘cuttlefish’ (not ‘squid’, which translates to calamar). Trying to track down the difference was, for my searches, inconclusive. Some sources imply calamar is far superior to sepia, thus deserving a higher price. Other sources believe very small sepia are best. The closest it seems, relative to scientific sense, is that sepia are cuttlefish which include critters that are commonly called ‘squid’; IOW all squid are cuttlefish but not all cuttlefish are squid. In the scientific articles various anatomical differences were explained but it was less clear in the culinary sense.

Again, what does it matter? Well, some cuttlefish may make a better calamari than others, plus some are big and some are little, so how much dinero (I’ll assume you know that as a loanword to English, otherwise it’s ‘money’) matters as well. Seafood is a particularly tricky food to buy as often substitutions are made of lesser animals for the more prized ones. Often even the fishmonger can’t tell the difference but when it comes to eating them (and paying for them) you should get what you expect. Now, OTOH, calamari (especially with some piquant red sauce) are probably hard to tell apart.

BTW: sepia does also literally translate to the color and there is a chance that would occur on a menu. AND, squid ink is la tinta natural del calamar or just tinta.

sidenote: One thing that has always confused me (speaking of tinta) is that red wine is almost always referred to as vino tinto even though ‘red’  is rojo, given that white wine is almost always vino blanco thus using white == blanco. I guess you just take it as it comes since this is a distinction one would quickly learn. Weird: in a little research for this sub-point, tinta is a noun (feminine, -a) just for ink whereas tinto is both noun (then for wine) and adjective (‘dyed’ or ‘stained’, but then for a dyed feminine noun we’d be back to tinta – oh, joy). Fortunately I doubt there would ever be a problem with this.

p.s. (added after initial post). If you think some of these were close I just hit simiente which is almost totally a synonym for semilla (both are seed), except, apparently (just a single anecdotal source) simiente also means semen. Very tiny distinction, both are used in Spain, so either might appear on menu, although it is another word for ‘seed’ that is far more likely, pepita, which in some parts of U.S. would be known directly, although most likely as roasted pumpkin seeds, not seeds in general. Funny coincidence I’d find another example minutes after publishing my post.