Unplanned post of menu translation

Instead of my planned post I’ve digressed into analyzing the menu of Gandarias, a restaurant in San Sebastián, Spain, recommended by a loyal reader.

I’ve been working (offline) on a series of posts comparing my experience of now nearly 500 days of learning the Spanish language with my original approach of analyzing menus from Spain and deducing menu vocabulary. My purpose has been to first find source material and translate it, create a corpus of translated material, extract from that corpus “translations” (not word-by-word, but more meaningful translations) and then create a smartphone app to contain all the deduced vocabulary and food/cooking terminology for a person trying to read menus in Spain.

I had originally planned to find source material and create a corpus without learning Spanish. I felt I could accomplish my purpose without language fluency. But somehow I got convinced to learn Spanish (I’m not good at languages so this is quite a challenge for me) and so for the past year I’ve had few posts about menus and interesting items I was finding. Just having a Spanish dictionary is not very helpful for figuring out what items on a menu happen to be.

So before posting some more on this general topic I had planned to show a few menu items as examples of some of the issues. I’d picked a restaurant, more or less at random, in León and had some examples ready to go. Instead circumstances provided me a different opportunity. While reading a post of another travel blogger about San Sebastián I decided to take a hint. While I can’t actually go to the restaurant, as recommended, I did find it had a good website that also resulted in an unexpected adventure.

On most of my previous analysis of menus I have not had a human English translation, partly because I was looking at small restaurants along the Camino de Santiago. So for my initial analysis I’m dependent on Google Translate, which often botches menus as I’ve pointed out in previous posts, plus then other investigation to figure out items.

In a few of the larger cities restaurants sometimes do have an English translation and this provides some extra calibration. When one is trying to build a corpus it is inevitable some errors creep in, but the quality of the final consensus view of translating menu items is enhanced by having as much raw material as possible, so human English translations really supplement the guesses Google and I are making in our translations.
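The "consensus view" idea can be sketched in a few lines of Python. This is only a minimal sketch under assumptions: the observation rows, the source labels and the weights (a human translation counting twice as much as a machine one) are all made up for illustration, not how the actual corpus is stored.

```python
from collections import Counter, defaultdict

# Hypothetical observations: (Spanish term, English rendering, source).
# The rows and source names are illustrative only.
observations = [
    ("revuelto de bacalao", "scrambled cod", "google"),
    ("revuelto de bacalao", "scrambled eggs with cod", "human menu A"),
    ("revuelto de bacalao", "scrambled eggs with cod", "human menu B"),
    ("callos", "calluses", "google"),
    ("callos", "tripe", "human menu A"),
]

# Assumed weights: any source not listed here is treated as a human translation.
WEIGHTS = {"google": 1.0}
HUMAN_WEIGHT = 2.0


def consensus(rows):
    """Return the highest-weighted English rendering for each Spanish term."""
    votes = defaultdict(Counter)
    for spanish, english, source in rows:
        votes[spanish.lower()][english.lower()] += WEIGHTS.get(source, HUMAN_WEIGHT)
    return {term: counter.most_common(1)[0][0] for term, counter in votes.items()}


for term, english in consensus(observations).items():
    print(f"{term}: {english}")
```

With more raw material each term collects more votes, which is why every extra human translation helps outvote the occasional machine blunder.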

So Restaurante Gandarias has its menu in both Spanish and English, as well as Euskara, the Basque language, given this restaurant is in the heart of Basque Country. San Sebastián is also a very popular resort and thus likely to attract many clients who will appreciate the English version. And even in the Spanish menu some items still use the Euskara terms.

Now a note about “menu”. In most restaurants that’s what a diner gets, but in Spain it is common that there is a designed menú, that is, several courses chosen by the restaurant and combined as a single order, also known as prix fixe, to use the French term. The “menu” I had originally planned to use for this post is in that category. OTOH, some restaurants (and their websites) also provide a carta, which Google translates as ‘letter’, which is nominally correct and totally correct in other circumstances (ahora escribo una carta, see I’ve learned something, did that from memory) and it can also mean card, as in cartas de juego (playing cards, as opposed to tarjeta de crédito for credit card; also fun when there are so many meanings for words, both to and from Spanish). But for this restaurant carta has the meaning, from the French and sometimes found in the USA, a la carte. Or basically individual items ordered separately at the diner’s choice.

The Gandarias carta is divided into sections: Todas (all), Ensaladas (salads), Entrantes (starters), Pescados (fish), Carnes (meat) and Postre (dessert) – and yes, I’ve had all but Entrantes in my Spanish lessons. So I selected Todas (in the Spanish version) and got four webpages of pictures of food with captions as to the item. Fine, I scooped up all four pages, did some fiddling to reformat and created the first column of the table I typically use for analysis. Knowing there was English I wanted to get the Google Translate first so I did that and lined up those items in a second column (all this will be at the end of the post).

Then in what I expected would be a routine mechanical process I switched to the English version of the website. Since Ensalada de bogavante was the first item in Spanish I didn’t even need the picture to realize that Roasted baby lamb, the first item in English, was not the same thing. A bit more poking around and I realized that while it appeared the English and Spanish menus had the same items, they were in totally different orders.

AH. A challenge. Now I have to take the English description of the item and find the corresponding Spanish. For this item, Lettuce and onion salad, I was able to pick Ensalada de lechuga y cebolla even without looking at the Google Translate, which is exactly the same, easy-peasy.

But it wasn’t all so easy; for instance Scrambled eggs with cod matches with Revuelto de bacalao, not just because one easily remembers bacalao is cod (about as common a food term as there is in Spain, even obvious from bacalhau, which I actually had multiple times in Portugal), but also because of revuelto. Revuelto has many dictionary translations: messy, upside down, mixed up, disheveled, untidy, nauseous, cloudy, turbulent (and more), but most usefully scrambled. I have dug through enough menus in Spain to know that scrambled (and implied to be of eggs) fits, hence scrambled eggs with cod (even though huevo is missing in the Spanish). Amusingly Google doesn’t get the implied eggs and therefore thinks it’s the cod that got scrambled so it says: Scrambled cod. If you were using your phone, do you think you’d order this?
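The matching-up I did by eye could be roughed out in code. This is a sketch under assumptions, not what I actually ran: it pairs each human English caption with the Spanish item whose Google translation shares the most words (Jaccard overlap), taking the best pairs first. The three items are from this post; the function names are hypothetical.

```python
# Spanish item -> Google Translate rendering (examples from this post).
google = {
    "Ensalada de lechuga y cebolla": "Lettuce and onion salad",
    "Revuelto de bacalao": "Scrambled cod",
    "Bogavante a la plancha": "Grilled lobster",
}

# Human English captions, in the (different) order of the English page.
human = [
    "Fresh lobster grilled",
    "Scrambled eggs with cod",
    "Lettuce and onion salad",
]


def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Shared words divided by all words; 0.0 when nothing is shared."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def match(human_items, google_by_spanish):
    """Greedily pair human captions with Spanish items, best overlap first."""
    pairs = sorted(
        ((jaccard(h, g), h, s)
         for h in human_items
         for s, g in google_by_spanish.items()),
        reverse=True,
    )
    matched, used_h, used_s = {}, set(), set()
    for score, h, s in pairs:
        if score > 0 and h not in used_h and s not in used_s:
            matched[h] = s
            used_h.add(h)
            used_s.add(s)
    return matched


for english, spanish in match(human, google).items():
    print(f"{english} -> {spanish}")
```

Anything with no shared words stays unmatched, which is exactly where the leftovers need a dictionary, the picture, or some thought.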

Now a few stumped me a bit more than others, but like one of those games where you match up things in columns I only had a few left and thus got my clue:  Grilled magret was the human English translation. ¿Qué?  Magret stumped my usual translation sources and Google had missed it, but in a Spanish dictionary (with Spanish definitions of Spanish words, not translation) I did find:

Filete de pechuga de pato o de ganso muy utilizado en la cocina francesa.

which I can almost translate myself but here’s the GT

Duck breast or goose fillet widely used in French cuisine.

So, in other words, it isn’t a Spanish word, but the key hint (as well, a bit, the picture) is pato, so I was able to match up with Magret de pato (I never just did searches in my text, instead trying to translate myself).

So I wanted to do a couple of more to finish my point, about some challenges of translating menus (which, btw, are NOT solved by just learning to speak Spanish):

Almejas a la marinera | Clams a la marinera | Fisherman´s style clams

So it helps to know, a la marinera, which one would more typically associate with Italian food, is a particular style, really, just a typical tomato sauce, EXCEPT, typically in Spain and with clams it is NOT a tomato sauce – fooled yah. Yep, the human translation of Fisherman style is real helpful, might be useful in San Francisco.

Arroz con leche casero | Rice with homemade Milk | Rice pudding

Google is just too literal: arroz con leche is just rice pudding, so the homemade (a valid translation of casero) applies to the dessert, not the milk.

Besugo a la plancha | Grilled sea bream | Grilled sea bream
Bogavante a la plancha | Grilled lobster | Fresh lobster grilled

Both of these provide a little fun as to exactly what a la plancha means. Yes, it does, more or less, mean grilled, but then think about what a la parrilla means (also grilled). Usually a la plancha (literally on a plate, or in Italian, on the iron) means just cooked on a hot steel plate, cast iron pan or ‘flattop’ in a diner. A la parrilla usually means a grate over some kind of open heat, either just gas or it can be wood (a la brasa). Now being fairly good with a grill myself I know these are quite different and I’d want to know which it really was. Which therefore brings up another point – reading a menu is not enough so being able to speak to your waiter (if knowledgeable) or even the chef may be required to really figure out if this is the dish you want. And therefore, that’s a different reason to actually learn to speak Spanish.

Chipirones a la plancha | Grilled squid | Grilled squids

Chipirones can be interesting because it’s only one of the words for squid, but in this case it means baby (small) squid and frequently, in Spain, battered and fried squid, or what we’d order in the USA as fried calamari. BUT, in this restaurant, given the picture, that’s not what this dish is.

Now a brief personal digression. For a couple of years I made multiple business trips to Japan. Learning Japanese was not going to happen and, worse, learning the written language is hard. My job required me to learn how Japanese is written (not the 1,945 standard Kanji, just the algorithms of typography). At the time most Japanese restaurants had displays of plastic food (rarely picture menus) with little labels in Kanji. I quickly learned, while I had no clue what the Kanji meant, how to copy them into a little notebook, choose my item from the plastic food and then show the Kanji to the waiter. It worked fine and I always got what I expected. But I have no idea if the actual menu in this restaurant (unlike the website) would have the really dumbed-down version to show the pictures.

Now a few interesting ones that being fairly fluent in Spanish or knowing much about Spanish food won’t help so much, plus these stumped Google a bit.

Changurro al horno | Baked Changurro | Baked spider crab

You see Google didn’t know changurro. BUT, remember we’re in Basque country, so a bit more searching shows that this word is really txangurro, where the tx, even just the x, is a giveaway that this is the Basque word and changurro is thus the Spanish spelling of it.

Kokotxas de bacalao al Pil-Pil con almejas | Cod Kokotxas al Pil-Pil with clams | Cod cheeks in pil-pil sauce with clams

The unusual spelling of kokotxas is another giveaway this is the Basque word, literally, cheeks, and really one needs to know this is a particular dish unique to Basque cooking to really have a clue what this means.

And

Pantxineta | Pantxineta | Pantxineta

I think you get this, obviously Basque, dessert where this is as good a description as any.

Rodaballo con su refrito ligado | Turbot with its tied rehash | Turbot with its thickened sauté

An amusing Google translation.

Tarta “Gandarias” elaborada por Rafa Gorrotxategi | “Gandarias” cake made by Rafa Gorrotxategi | Pastry chef Rafa Gorrotxategi´s “Gandarias” cheesecake

Totally meaningless terms, in any language. Even the generic Spanish tarta is ambiguous as to exactly what this might be.

Solomillo de vaca vieja con foie al Oporto | Old beef sirloin with foie gras in Porto | Old cow sirloin with foie in Oporto style

So here are a couple of interesting terms that just don’t translate (at least from Spanish): foie (the French word for liver, most foodies would just know this as language independent) and Oporto (second biggest city in Portugal so probably most travelers would recognize it, but is it Port or OPorto (clue, in some languages O is ‘the’)). And what style is that? If I was telling you about BBQ and said “Texas” style would you know that’s brisket withOUT sauce?

Tabla de ibéricos de bellota «Joselito» | Table of Iberico de bellota «Joselito» | Mixed iberian “Joselito”

I’ve mentioned Iberico de bellota in many posts before and if you go to Spain you’d better know what this means as you’ll pay a seriously premium price to get some slices of ham.

Personal Note: Here in flyover Nebraska there is actually a farmer who raises very similar pigs and lets them roam, yes, among oak trees and eat some acorns. AND, there is a gourmet butcher in Fort Calhoun, CURE (just there yesterday) who makes very similar (air dried, no smoke or salt) hams from those pigs, and, yes, for a really serious price. I may never have had Spanish Lomo but it’s delicious from CURE.

Callos | calluses | Tripes

I had to include this one because, well, one reason I want to know about menus in Spanish is there are things I choose not to eat and this is one of them. Given Google can’t translate it, I’m glad I’ve got this in my lexicon.

And just for fun

Coulant de chocolate | Chocolate coulant | Chocolate fondant

Chocolate is the literal word in Spanish for the same word in English (and nearly the same in French) BUT it doesn’t belong to any of these languages since it’s really xocolātl, so even Spanish has plenty of loanwords. But what about coulant, which is really a French word, meaning flowing? Interestingly the human English translation uses fondant, but that’s just another French word. And there is no English word, so if you don’t know what this is, there is no point in trying to translate.

So after a long post, you’re probably ready for dessert, so how about

Crema de yogur con mango crujiente y sirope de fresa | Yogurt cream with crispy mango and strawberry syrup | Yoghurt cream with crispy mango and strawberry syrup

Looking at the words on menus only reveals a bit about dining. Knowing a bit more about cooking in general, and Spanish cooking in particular, helps a lot. But if a person only had one chance to go to this restaurant and wanted to get the most interesting items, some discussion with a (hopefully knowledgeable) waiter is essential.

So one conclusion from all this is that the basic idea of my project, translating food, is fundamentally a failure. One can translate words, or even combinations of words, and still have little idea what a menu item is.

Translation, as it is said in math, is a necessary, but not sufficient condition.

¡Volví! ¿Me extrañaste? Ha sido un tiempo.

Sí, puedo tutearte ya que somos amigos. Or IOW, I can address you, Dear Reader, as tú since we’re friends here. And to my new friends, who may read this blog for the first time, I’m old and thus more likely senior to you and so I don’t have to use the formal ustedes.

I haven’t written any posts about Spanish to use for food and restaurants as is the plan for this blog since I’ve been very busy. I haven’t lost interest and intend to continue more exciting posts about interesting Spanish terminology you’ll find on menus in Spain (and, mostly, for other Spanish speaking countries).

When I started finding and decoding menus along the Camino de Santiago in Spain I didn’t know any Spanish. I thought I could still figure out the Spanish on menus by associating what I find on menus with either human or automated translations, plus a lot of searching for more obscure (non dictionary) terms. Several people insisted I’d need to learn Spanish in order to do this, but, initially, I dismissed that suggestion.

I didn’t try to learn Spanish because I had tried in the past with little success, using the conventional learning materials. But, fortunately, there are new tools today. So I’m now on my 352nd day of using Duolingo to actually try to learn the language. Duolingo is great and I’m about half way through its Spanish course. But at the same time I found I needed to do other things and fortunately there are lots of other sources to use for study.

So I’ve done about 64,000 individual drills in Duolingo and so have picked up over 3000 words. I can (just barely) get through the A1 CEFR tests. I’ve also “read” about 50 beginner stories, plus even tried some literature (way beyond even A2 level, but interesting to try). I’ve “read” (with lots of help from a dictionary since the vocabulary is more extensive than Duolingo’s) lots of recipes (recetas) and descriptive text at numerous restaurant websites in Spain. So I get a lot of practice reading.

But I don’t get any practice speaking (no partner/tutor/teacher for that) and not much practice listening (Duo’s audio is easier than real speaking), but I try to follow numerous TV programs or even specialized programs, like the wonderful La Casa de las Flores on Netflix. When I started, all spoken Spanish was just a blur of sound to me, but now I can catch a little bit. I still don’t have enough vocabulary to recognize enough of the words to detect word boundaries, which really (to my ear) blur together in spoken Spanish.

So while I have another year of study ahead, to finish the Duolingo course and probably get near the A2 level and then also maybe have a 5000-word vocabulary, I’ve learned enough that it’s much easier for me to read restaurant websites. I’ve had lots of opportunity to see what the automated translation does right and wrong and so I can use my knowledge, the automated translations and additional analysis to get most of the content.

Thus I should be able to do even better posts. Even though I didn’t know the language, before, I did figure out enough, IMHO, to find and describe some interesting things about Spanish menus, so now I expect to do even better.

Also, in previous posts I described my “virtual” trek on the Camino. Simply, to encourage myself to do exercise on my treadmill, I converted my exercise mileage along a GPS track to find my location on Google Maps and then use their overhead views, photos, the StreetView (when available) and other geotagged sites to “explore” the Camino. And as I previously posted I eventually did the entire distance, 796.4km (about 500 miles) to  Santiago de Compostela.
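For the software types, the treadmill-to-map conversion can be sketched like this. It is a minimal sketch under assumptions: the track is a plain list of (latitude, longitude) points, distances between points use the haversine great-circle formula, and the position inside a leg is a simple linear interpolation (fine for short legs). The function names and the toy track are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0
KM_PER_MILE = 1.609344


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def locate(track, miles_walked):
    """Return the (lat, lon) reached after walking miles_walked along track."""
    remaining = miles_walked * KM_PER_MILE
    for start, end in zip(track, track[1:]):
        leg = haversine_km(start, end)
        if leg > 0 and remaining <= leg:
            fraction = remaining / leg
            return (start[0] + fraction * (end[0] - start[0]),
                    start[1] + fraction * (end[1] - start[1]))
        remaining -= leg
    return track[-1]  # walked past the end of the track


# Toy track along the equator, NOT real Camino data.
toy_track = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
print(locate(toy_track, 100.0))
```

The returned coordinates are what one would paste into a map search to "look around" at that spot.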

So, after getting there, I needed a new “virtual” trek goal so as I previously posted I started the French part of the Camino, starting at Le Puy-en-Velay, and I’ve now reached Conques, 125 miles. While “walking” the Spanish part I “stopped” at every restaurant and hotel/albergue to look at all the photos, mostly of food or menus. I could do the same thing in France (and sometimes do) but information about that route is less plentiful and what I find on Google Maps is both French language and French food, which is wonderful (I did have some French in school), but not my goal. So that virtual trek has not been as engaging to me and thus I haven’t done any posts about it (and probably won’t).

Meanwhile I really want to turn all this purely vicarious activity into something real so I continue to look at two things: a) some Spanish speaking country to visit, not just as tourist, but really trying to get to know, and now my focus is on Ecuador, but probably only after some of the political unrest there settles down, and, b) trying to do one of the immersive language study programs in a Spanish speaking country (some excellent sources of these things can be found online).

So I have lots to keep me busy and thus I won’t have time for as many posts as I was originally doing, but now I’ll try to find something, still focused on food, to discuss.

One of my next projects will be this:

abarquillar abrillantar abrir acabar acanalar acaramelar aceitar aceptar achicharrar acidular acitronar aderezar adobar agregar ahumar albardar alcanzar aliñar almibarar almorzar amar amasar añadir andar anisar apagar aparecer aplanar aplastar aprender aromatizar asar asustar atar aviar ayudar bañar bardar batir beber blanquear brasear bridar buscar caer calentar cambiar capear caramelizar cascar catar cenar cepillar cernir chafar chamuscar chorrear cincelar clarificar cocer cocinar colar combinar comenzar comer comprar comprender condimentar conducir confitar congelar conocer conseguir conservar considerar contar convertir correr cortar crear creer cuajar cubrir cumplir dar deber decantar decidir decir decorar degustar dejar derramar derretir desalar desayunar desbabar desbardar desbridar descamar descansar descongelar descubrir desengrasar desglasar desgranar desgrasar deshuesador deshuesar desleír desmoldar desnatar desplumar desvenar dirigir disfrutar doblar dorar dormir echar emborrachar embridar empanar empanizar empezar emplatar emulsionar encender encontrar endulzar enfriar engrasar enharinar entender entrar envolver escabechar escaldar escalfar escamar escribir escuchar escurrir especiar esperar espesar espolvorear espumar estar estirar estofar estudiar evaporar existir explicar exprimir fermentar filetear flambear flamear formar forrar freír frotar fundir ganar glasear gratinar guisar gustar haber hablar hacer helar hervir hornear humear humedecer imaginar incorporar instilar intentar introducir ir jugar laminar lavar leer levantar levar ligar limpiar llamar llegar llenar llevar lograr machacar majar mantener marear marinar masticar mechar medir mezclar mirar mojar moldear moler mondar montar morir mover nacer napar necesitar nevar ocurrir ofrecer oír oler pagar paño parecer partir pasar pasteurizar pedir pelar pensar perder perfumar permitir picar pinchar pochar poder poner precalentar preguntar preparar presentar probar producir quedar quemar querer 
quitar rallar realizar rebanar rebozar recalentar recibir recomendar reconocer recordar reducir regar regresar rehogar rellenar remojar remover repetir reservar restregar resultar revisar revolver rociar romper rostir saber sabor sacar salar salir salpicar salpimentar saltear sancochar sazonar secar seguir sellar sentir ser servir soasar socarrar sofreír subir sumergir suponer tajar tamizar tapar tener terminar tocar tomar tostar trabajar traducir traer transferir tratar trinchar triturar trocear trufar untar usar utilizar vaciar vaporar vaporear vaporizar venir ver verter viajar vivir voltear volver

Yes, that’s a massive list of the verbs I’ve found in over twenty different sources that relate to cooking or dining. Finding, extracting, cleaning up, merging and then getting “consensus” translations is tedious work but I’m chugging through this list (far bigger than any single list I found anywhere online) and will surely have some material for posts and probably another page (like my glossary) to provide what I think will be the most comprehensive online list. The ones I marked in bold are the ones I now just know from my Duolingo study, not bad for an old dog who knew zero Spanish a year ago. But this also shows how little food/cooking/restaurant information is available in standard Spanish courses and how much more there is to learn.
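The cleaning and merging step is the kind of thing a few lines of Python can help with. This is a hedged sketch, not my actual workflow: the source names and sample words are hypothetical, and the infinitive check is just a crude ending test (-ar, -er, -ir, -ír) that ignores reflexive forms like -arse.

```python
from collections import defaultdict


def merge_verb_lists(sources):
    """Merge word lists, recording which sources each cleaned word came from."""
    seen = defaultdict(set)
    for name, words in sources.items():
        for word in words:
            cleaned = word.strip().lower()
            if cleaned:
                seen[cleaned].add(name)
    # Plain code-point sort; accented letters will not sort as in a dictionary.
    return dict(sorted(seen.items()))


def looks_like_infinitive(word):
    """Crude check: Spanish infinitives end in -ar, -er, -ir (or -ír)."""
    return word.endswith(("ar", "er", "ir", "ír"))


# Hypothetical sources, for illustration only.
sources = {
    "cookbook": ["Asar ", "freír", "rallar"],
    "website": ["asar", "paño", "sabor", "oír"],
}

merged = merge_verb_lists(sources)
suspects = [word for word in merged if not looks_like_infinitive(word)]
print(len(merged), "unique words; check these by hand:", suspects)
```

A check like this would flag the few nouns (paño, sabor, deshuesador) that slipped into the list above.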

By the way here are some verbos of interest:

desayunar to eat breakfast (el desayuno)
almorzar to eat lunch (el almuerzo)
cenar to eat dinner (la cena)
comer to eat
beber/tomar to drink

So plenty to do and hopefully more interesting posts to follow.

So

Vamos a caminar y comer.

and

¡buen provecho!

 

Reading menus in Spain

Scroll down to the bottom of this post to see Spanish terms for food allergens.

I started this blog to document work I was doing to collect a large corpus of Spanish terms found on menus (focused on Spain, not Latin America) and from that develop an application to aid in reading menus. You might think this already exists with one of the AI translation systems but those make many mistakes with food.

Anyway that was over a year ago and I’ve gotten side-tracked on various things. It was suggested I should just learn Spanish but I always felt that was too difficult (I’d tried unsuccessfully before) and also menu terms are more specific than what generic Spanish classes cover. My notion, as a software type, is that my application is simply a question of manipulating symbols. Sure, reading literature or poetry does require knowing the language, and very well at that, but cooking and cuisine and food are a specialized vocabulary with minimal need for understanding grammar or conjugation or what is usually taught in language classes.

Well, in the end I gave in. It turns out reading a menu is one thing, actually being able to ask questions (preguntas) about it and understand the answer is another. My early research demonstrated that what is written on menus, often, is inadequate to actually know what dish you’re getting, what’s in it and how it’s prepared.

So, 186 consecutive days later, I am learning Spanish from a very good online site, Duolingo. According to them I’m up to 1526 lexemes (about 1/3rd of the way through their course). But while that’s been very helpful: a) that course doesn’t have much about food or cooking (I have phrases for how to order though, and two words for waiter, camarero and mesero, and why sometimes it should be an ‘a’ instead of ‘o’ at the end), and, b) even just reading (like the prose descriptions restaurants often have of themselves and their culinary approach on the menu) is not entirely aided by the types of drills common to language learning programs.

IOW, it has helped and is helping, but it’s not enough. So, in fact, my original notion is still fairly valid, focus on menus and how to read them.

Now in order to find menus I do this silly thing of converting miles I put in on a treadmill in the basement to a GPS track of the Camino de Santiago. Then using Google Maps I’ve explored all sorts of restaurants along the Camino. Now most are simple mom-and-pops with fairly limited menu but every now and then you get to a large city where the cuisine can be considerably more sophisticated. And as I mentioned in a recent post I’ve “reached” Santiago de Compostela which attracts lots of tourists and partly as a consequence has 571 restaurants at just one rating site. IOW, lots of rough material to study.

In addition, with the help of some Spanish (Spain) cookbooks and lots of exploring menus, I’ve learned that in addition to cuisine in Spain having many regional variations there are also regional languages to deal with. When you start the Camino you see a lot of terms from the Basque language and when you end in Galicia you see Galego which I learned is more related to Portuguese than Castilian. Since I’m casually exploring Portuguese at Duolingo I quickly learned why A and O appear so often in Galicia, being the equivalents of the la and el the’s of Spanish.

So I’m now digging through menus in Santiago and expect to have a number of posts from that work. But just to put a little meat in this post I’ll describe one interesting thing I just saw. The restaurant O Curro da Parra has the first menu I’ll describe, but for now I just want to discuss this bit. For example we see an item:

Helado de tarta de Santiago, cremoso de chocolate y bizcocho cítrico

(A: leche, huevo, gluten, frutos secos)

At first I thought the bit in parentheses was the ingredients but then realized (it’s not explained on the website) the A: probably stands for alérgeno (allergen) or alergia (allergy). Isn’t that nice of them to provide information about the dish for people with food allergies or sensitivities? So I’ve collected this list from the entire menu:

apio celery
crustáceos crustaceans
frutos de cáscara nuts (tree nuts; literally ‘fruits with a shell’)
gluten gluten
huevo egg
leche milk
moluscos mollusks
mostaza mustard
pescado fish
sésamo sesame
soja soy 
sulfitos sulfites
frutos secos nuts

Now most of these are straightforward but there are a couple of mysteries. First is soia, as the menu spells it, which the restaurant’s website translates as ‘soy’. But that doesn’t match anything I find in references since soy is usually soja (in Spain) and soya (in Latin America) so I assume that’s some regional spelling difference (and Google Translate thinks it’s ‘soy’). And frutos de cáscara was a mystery at first. It’s mentioned for a dessert and translated at the website as ‘nuts’, but the website also lists another item, frutos secos, which is the more common translation of ‘nuts’. Cáscara by itself is ‘rind’ or ‘shell’ so my first guess was a reference to the ‘peel’ of a fruit (probably lime since that is included in the name of the dessert). But frutos de cáscara is also the term the Spanish text of the EU allergen labeling rules uses for tree nuts (almonds, hazelnuts, walnuts and so on), so the website’s ‘nuts’ is most likely right and my ‘peel’ guess wrong. So even with dictionaries and AI translations and even human translations you might still not be able to figure these out exactly and if you do have allergies you probably need to know for certain, so habla con el camarero.
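Since the allergen notes follow a fixed pattern, "(A: term, term, ...)", they are easy to pull apart with code. This is a small sketch under assumptions: the pattern is only what I saw on this one menu, the glossary is the list from this post (using the restaurant’s own ‘nuts’ for frutos de cáscara), and the function name is hypothetical.

```python
import re

# Glossary collected from the menu in this post.
ALLERGENS = {
    "apio": "celery",
    "crustáceos": "crustaceans",
    "frutos de cáscara": "nuts",  # the restaurant's own English rendering
    "gluten": "gluten",
    "huevo": "egg",
    "leche": "milk",
    "moluscos": "mollusks",
    "mostaza": "mustard",
    "pescado": "fish",
    "sésamo": "sesame",
    "soja": "soy",
    "soia": "soy",  # spelling seen on this menu
    "sulfitos": "sulfites",
    "frutos secos": "nuts",
}


def parse_allergens(line):
    """Extract the terms from a '(A: ...)' note and translate the known ones."""
    found = re.search(r"\(A:\s*([^)]*)\)", line)
    if not found:
        return []
    terms = [t.strip().lower() for t in found.group(1).split(",") if t.strip()]
    return [(t, ALLERGENS.get(t, "UNKNOWN, ask the waiter")) for t in terms]


for spanish, english in parse_allergens("(A: leche, huevo, gluten, frutos secos)"):
    print(f"{spanish}: {english}")
```

Anything not in the glossary comes back marked as unknown rather than guessed, which for allergies seems the safer default.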

More coming, stay tuned.

 

Next virtual trek – my plan didn’t work out

I know this sequence of posts is way off the primary topic of this blog but this will be the last one (on this topic, at least for a while).

When I last left you hanging I described the method I was going to use to acquire an accurate table of distances, fairly closely spaced (e.g. 3-6km) along the Via Podiensis so I could spend the next year or so on the treadmill piling up miles to then “take” a virtual trek. My plan was to use a couple of GPS tracks I found online to get an accurate distance along the entire trail and then pick intermediate spots for my table and know their distances.

Since the software I have on my PC only covers the USA my only available tool (at least in the initial plan) was Google Maps (I later tried Google Earth, which has more features). I quickly learned two things: 1) the high-resolution (4000 waypoints) GPS track was very tedious to enter (all manually) into Google Directions, which has a limit of 10 points along a route, and thus I was getting less than 1km of trail for 5 minutes or so of work, 2) every now and then, but in minor ways, Google didn’t want to generate precisely the same route as I could see on the map where I could display the entire track (but not get any distances).
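To get a feel for how much manual entry that 10-point limit implies, here is a small sketch. It only does the bookkeeping (it calls no Google service); the function name is hypothetical, and it assumes each batch must begin at the last point of the previous one so the route pieces join up.

```python
def batches(waypoints, limit=10):
    """Split waypoints into runs of at most `limit` points that share endpoints."""
    step = limit - 1  # consecutive batches overlap by one point
    return [waypoints[i:i + limit] for i in range(0, len(waypoints) - 1, step)]


track = list(range(4000))  # stand-in for 4000 GPS waypoints
pieces = batches(track)
print(len(pieces), "separate routes to enter by hand")
```

At 5 minutes or so per batch, that count makes it clear why the high-resolution track was not practical to enter manually.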

So I switched to the lower resolution track (only 500 points, visually on the maps it’s a bunch of line segments that don’t precisely follow the road/street/path/trail). But I figured I could find the flaws in that and patch in bits of the high resolution data.

Now in some ways I’m really being OCDish about this. What difference does it make to be highly accurate? Well, consider this: a real walk has to go where the path goes, not in straight lines across country or through someone’s house or yard. And most of the backroads where the Camino goes are not straight super highways but meandering paths. Now if you’ve ever hiked in the real world you know your actual path can be a lot longer than just a compass line on a map. All those zigs and zags add up. The small set of straight line segments would probably be off, in total distance, by hundreds of kilometers. IOW, not much use for accurately converting treadmill miles to a location on the ground in France.
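The zigs-and-zags point is easy to show with a toy example. This is only an illustration with made-up numbers: a zigzag path in flat x/y kilometers (not real GPS data), measured once with every point and once keeping only every 8th point, the way a low-resolution track would.

```python
import math


def path_length(points):
    """Sum of straight-line distances between consecutive (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Toy zigzag: 0.1 km forward per step, swinging 0.1 km side to side.
detailed = [(i * 0.1, 0.1 * (i % 2)) for i in range(41)]
coarse = detailed[::8]  # keep every 8th point only

print(f"detailed: {path_length(detailed):.2f} km")
print(f"coarse:   {path_length(coarse):.2f} km")
```

The coarse version can only ever come out shorter (or equal), never longer, so a low-resolution track systematically puts you further along the trail than you really are.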

But not to worry, Google knows this and so it actually follows the road between two points on the road. And while it does a bit of rounding in the distance that’s still going to be fairly accurate.

So other than being a tedious process my preliminary results showed, at the cost of more time than I’d hoped, I could get a fairly accurate route.

WRONG!

I was manually entering a set of points, having worked out a record keeping procedure for doing all this, and everything was fine. Then came the next point, probably only 50m from the previous, with a road showing in map mode and even clearer in satellite photo mode, and Google routed this round-about path, about a kilometer, that was essentially a giant U-turn to reach that point from the other direction!

Now sometimes, at least here doing geodashing in the midwest, that’s exactly what one has to do. Yes there is a road on the map and yes you can see it in the satellite photos and NO you can’t go that way because there is a gate or a damaged bridge or whatever. But presumably the GPS track I’m using means that the person who recorded the track DID go that way so it’s possible.

After more experimenting I eventually discovered that what I’m seeing is gaps in the Google underlying database, i.e. some abstracted mathematical description of all the possible roads/paths/trails they know. And in that database you can’t get from point A to point B, at least not just going forward.

So after reading manuals and searching online I eventually discovered (I think) there is no way to solve this. Some electronic mapping systems let you manually enter “vias”, i.e. some line segment that connects two bits of road together. That software is letting you use your knowledge (you can go that way) to override their database that can’t allow you to go that way.

But Google isn’t designed for complex routing issues. It’s designed for ordinary users to do simple things and thus doesn’t clutter up its UI with all sorts of advanced features. I encountered this with my standard USA mapping application (now defunct as the company was bought out and their products dropped; I won’t mention the name). That program was for “pros”, people who had complex navigation problems. For a while it was the only car-based solution but gradually the dashboard GPS came out and also, of course, Google Maps on smartphones. Those solutions are generally much easier to use, but they are “dumbed-down” relative to people with complex navigation requirements, which of course is a very tiny fraction of the market that they can afford to ignore.

So after searching for other solutions (there are a few other online mapping systems, but most have even less data than Google) it appears, like my route on the map, I just can’t get there.

As someone so often says, “SAD”.

So that means I have to use the one other data source I have, which has two problems: 1) the distances between the 34 overnight stops are rounded off and add up to about 50km less than the known distance of the route (there are multiple answers to be found for that total, but all of them are greater), and, 2) there are just the 34 waypoints, each of which will take me weeks to reach (yes, the trekkers do each in a day, but I couldn’t imagine doing 20 miles / 6 hours on the treadmill in a day).

Plus my purpose in all this is a “virtual” trek. I did learn that Google has lots of detailed data at short distance intervals, restaurants, hotels, gîtes (the French equivalent of albergues) and other points of interest. So I need all that detail to “see” what the trek would look like. It turns out that only doing relatively short daily distances on the treadmill allowed me to follow (where available) the entire streetview (so literally walk into a town and look around). I have lots of experience looking at satellite photos (though mostly in the plains and Midwest US, which doesn’t look much like France, or even Spain) but online satphotos aren’t the high resolution spy photos so often you can’t “see” very much. And looking at the roof of a house or building is much less interesting than looking at it at ground level.

So while I can use the table I did find, just for statistical purposes, I’m going to have to really guess (from zooming in on the GPS track displayed in Google Earth, unless I can figure out how to load KML files into Google Maps) where I am. It’s not going to be pretty and that’s a bummer that may take too much “fun” out of my virtual trek to even bother.

At least one thing, though, is I can take a look at some French restaurants and while I’m not interested in trying to build a translation app for that, at least I can see lots of pretty pictures of food (already seen some; the first course in France seems to routinely be pâté, not cured meats as in Spain).

So with all this discussion out of the way I can get back to my regular topic, menus in Spain, since Santiago has a ton of restaurants, some with online menus I can decode.

Next virtual trek

I mentioned in yesterday’s post that I had completed my virtual trek of the Camino de Santiago. That is, I take mileage I accumulate on my treadmill in the basement and convert it to locations along the Camino. Google Maps and Streetviews then provide a good “look” at the route.

Why do I do this? First, I want to actually learn as much as I can about walking the Camino and my relatively low daily distances on the treadmill are easy to follow on Google Maps, also allowing me to find restaurants and albergues along the Camino and study their photos and menus to learn more about food, or generally something about what walking the Camino would be like. Second, using a treadmill is boring so I need some sort of incentive – knowing I’m just a short distance, along the route of my virtual trek, to a particular POI (Point of Interest) on a map gives me motivation to do a bit more on the treadmill.

So now that I’ve “finished” the Camino what do I do?

Now I put “finished” in quotes because the data I have for the Camino’s route (and thus distances along the route) is somewhat uncertain. I found a Google Earth GPS track of the Camino and used that for a while, but whoever set that up didn’t renew their Google license (for embedded maps in webpages) and it failed. So I found another route. And guess what, they’re not the same.

There’s an old joke that a man who has just one watch “knows” what time it is, but a man with two watches isn’t sure, i.e. different sources of data almost always disagree. Also, until my latest exercise I didn’t try to get distances along the Camino directly from the GPS data but instead from a table I found on the Net. I did enough analysis to confirm that table seemed relatively accurate and so used that data to declare I had “finished” the Camino.

But two new items for me. While I had learned that “Camino” itself is a vague term (there are many routes of the Camino) I didn’t realize that the Camino Frances (the most popular route) doesn’t actually start in Saint-Jean-Pied-de-Port; that’s just the most popular starting point resulting in about an 800km walk. Instead that particular Camino really can start various places in France, but most commonly in Le Puy-en-Velay France (and then that segment goes by the name, Via Podiensis). Adding that segment (and also going past Santiago to Fisterra) turns the walk into a 1000 mile trek, not just the <500 miles of the conventional route.

So now I have an obvious extension to the Camino to use as my new virtual trek, the entire 1000 mile distance, which will give me something to do on my treadmill for another year. So that gives me a new project: figure out the distances along the Via Podiensis. Right away (and I’ll describe this in more detail in a follow-on post) I found several GPS tracks but all of those have some “issues” as to figuring out distances and milestone waypoints. I also found, at a website that does escorted walks, a table of distances between the 34 overnight stops they make. But that route is: a) not exactly the detailed route of the Via Podiensis, and, b) the distances are round numbers whose sum over all the segments is about 80km less than various sources claim is the total distance.

Now people actually walking the Via Podiensis couldn’t care less about all this; they’ll find the route (possibly with some misdirection) and get to their destination. But I need as accurate a route and table of distances as I can create to do my conversion from miles on the treadmill to locations in France. And so that’s what I’m working on now and will report on in a short while.
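The conversion from treadmill miles to a spot on the route can be sketched roughly as below. This is only a minimal illustration, not the actual procedure: the milestone names and distances are made-up placeholders, and `locate` is a hypothetical helper that looks up the last milestone passed in a table of cumulative distances.

```python
from bisect import bisect_right

# Hypothetical milestone table: (cumulative km from the start, place name).
# These distances are placeholders for illustration, not measured values.
MILESTONES = [
    (0.0, "Start"),
    (10.0, "Waypoint A"),
    (22.5, "Waypoint B"),
    (40.0, "Waypoint C"),
]

KM_PER_MILE = 1.609344


def locate(treadmill_miles, milestones=MILESTONES):
    """Return (last milestone passed, km walked beyond it, next milestone or None)."""
    km = treadmill_miles * KM_PER_MILE
    distances = [d for d, _ in milestones]
    # Index of the last milestone whose cumulative distance is <= km.
    i = max(bisect_right(distances, km) - 1, 0)
    passed_km, passed_name = milestones[i]
    next_name = milestones[i + 1][1] if i + 1 < len(milestones) else None
    return passed_name, km - passed_km, next_name
```

For example, `locate(10.0)` would report being a bit over 6km past "Waypoint A" and heading for "Waypoint B". The finer the milestone table, the easier it is to find the matching spot on the map.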

Fortunately I have plotted about the first 40km and as I’m now only (on cumulative treadmill distances) about 2km past Santiago I can restart my virtual trek for at least a couple of weeks while I figure the rest out from the multiple sources I have (and perhaps even more I might find).

Now how do I do this?

I have a long history with GPS and GPS tracks and I’ll bore you, Dear Reader, with (and record for myself) some details.

I first learned about GPS when I was working at a small startup in Silicon Valley and one of the engineers was recruited to go work at a new startup, Trimble. I’d never heard of this (or GPS) but learned an ex-HPer, named Trimble, had started the company and was recruiting colleagues he’d known at HP (now in the diaspora of former HP employees populating all the other startups). At that time GPS was a military technology and had a hugely expensive system (in nuclear submarines) but Trimble believed this could be re-engineered for a consumer (albeit only professionals) technology. Later, in another company I used to ride my bike to work and I often noticed people with huge backpacks and an attached 6′ long stick with electronics  on top. I didn’t know it at the time but these engineers were testing the early Trimble prototypes.

So fast forward about a decade and when I first moved to Nebraska I was going crazy in the winters (having been spoiled by California) and so just set out driving south, eventually ending up in Big Bend National Park. Driving solo and trying to read a paper map was nearly impossible so I was in the market for a better alternative. A bit of research revealed that GPSr (the ‘r’ is for ‘receiver’) had truly been reduced to consumer (affordable) level and so I bought my first laptop and the DeLorme GPSr and its software. The world of automated navigation was opened to me.

The laptop worked fine in the car (I also had to discover “inverters”, then uncommon, to power the laptop) but was useless for walking. That led me to discover handheld GPSr’s, in particular the early Garmin eTrex models, which I bought at the original Cabela’s (in Sidney, Nebraska) and used for the first time hiking in the Bighorn Mountains in Wyoming, learning an important first lesson: use the GPSr to record the location of your car so you can get back to it.

All this led me to the world of geodashing, one of the various geo-xxx “sports” from the earliest days of consumer GPS, when receivers were still rare and so enthusiasts would find a way to make a game of using a GPS. Over time I learned more about mapping and especially the early satphotos to use to study a place one might go, where roads shown on the electronic maps (the data was crummy back then) might not really exist. Over the years I got better and better at using these tools, which eventually led me to my first “virtual” trek.

Now raw GPS tracks are usually pretty messy data. For instance, here’s a set of tracks, made over multiple days (since time affects GPS accuracy) of a corner near my house.

or even this set of tracks including the driveway of my house (the red lines are actual paths of the streets as taken from a surveyed map) – note all the scatter in the data, this will come up as an issue in my next post.

Each GPS has various options for recording data and as you can see in this image (I recorded the maximum data) there is a lot of variability. IOW, early on, with my own experiments I came to look at GPS tracks with a bit of skepticism. So tracks I found on the Net I know are not quite right.

So with all this practice and knowledge I set out to create my first virtual trek, the Pacific Crest Trail (which, btw, I did “finish”, as in do the necessary distance on my treadmill). This was years ago and I don’t remember the details but I remember writing my own code to convert the KML (Google Earth) file I’d found into DeLorme “route” info. I quickly learned that DeLorme couldn’t handle the entire PCT as a single “route” so I had to break it in pieces.

BUT, the key thing was DeLorme could convert the waypoints (fortunately closely spaced) to distances. Given the PCT doesn’t follow any “roads” the routing within DeLorme itself was useless, but I found a way to get distances from the GPS track and from that I could then convert my cumulative treadmill distances to location. Of course I used Google Earth to “view” the PCT, but: a) at that time Google hadn’t done Streetview yet, and, b) the PCT is a wilderness trail that doesn’t follow any “roads” in the DeLorme database. But DeLorme (the Topo version) was designed for people doing outdoor recreation and thus was happy to have routes that didn’t follow any known paths in its database and still get distances.
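Getting distances from closely spaced track points doesn’t strictly need a mapping program. As a rough sketch (this is not what DeLorme did internally, just the standard haversine great-circle formula, and it ignores elevation and the scatter in raw GPS data), the cumulative distance along a track can be computed like this:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cumulative_km(points):
    """Given [(lat, lon), ...] in track order, return the running distance at each point."""
    if not points:
        return []
    running = [0.0]
    for (lat1, lon1), (lat2, lon2) in zip(points, points[1:]):
        running.append(running[-1] + haversine_km(lat1, lon1, lat2, lon2))
    return running
```

The last value of `cumulative_km(track)` is the total track length. Because it sums straight segments between points, widely spaced points will understate the real walking distance.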

So all of this led to where I am now. I hoped to repeat the process but knowing: a) there is a lot more and newer information, mostly from Google, and, b) DeLorme only has detailed maps for the USA. So now I had to find a new way to replicate the process I used for the PCT and apply it to the Via Podiensis.

And I’ll end this post with this, to be continued with the explanation of the process I am discovering (still having to experiment some) for Via Podiensis which eventually means I’ll have what I need: a fairly precise table of distances (at roughly 10km intervals) that actually follows the roads, paths and even off-road trails (not known to Google, but I can guess some). It’s a tedious process but for me, with my weird obsessions, an interesting exercise in itself with the ultimate outcome (still a hope but fairly sure I can do it) to create what I need for another ~750km of virtual trek.

 

Glossary Updated

This post describes a recent process to update the glossary found on this blog. I believe a reader should know how a glossary is assembled in order to know how much to trust its accuracy so I’m trying to be as transparent about process as possible. Furthermore my glossary has two “biases”: 1) it is aimed at terms found in Spain, not any Spanish term from anywhere, and, 2) I (mostly) only include terms I’ve actually found on the hundreds of menus from restaurants in Spain I’ve collected and analyzed to create a highly curated corpus. So while considerable effort went into constructing the glossary, naturally it still has errors as it was manually compiled. But I believe it is one of the better and more exhaustive glossaries you’ll find, at least for free on the Net.

After eight more days of work since my post about this effort I decided to call it “done” and update my glossary page as version 4.0. The glossary gained about 150 items, had numerous errors corrected (especially spelling, especially accents), had some definitions changed or enhanced, and adopted my “syntax” to show all the forms of a word under a single “lemma” (just learned this term from linguistics).

Despite all the work I did there are still mistakes, omissions, inconsistencies in the lemma representations and other errors. This is the challenge of manually editing a large amount of material, even while trying to be very careful. Each time I do this manually I learn a bit more about how I’ll have to create the software to create and manage a properly curated corpus which I’ll need for my translation application.

Not every term in this glossary is really a “translation” to English as often there is no translation. So instead, based on terms I have found in the many menus from Spain restaurants that I’ve analyzed as the “raw” data, I have sometimes had to supply a description instead of either a “definition” or a translation. For instance, I researched and added most of the names of grapes used in Spanish wines, olives used in tapas and cheeses used in various dishes. While one might translate Cabrales as “blue cheese” this isn’t that helpful so descriptions work better.

So almost every term in my glossary I have found in menus. There are more terms in the various glossaries I’ve found and assembled but unless I actually see a term used in a menu in Spain I can’t be certain some term from some other glossary actually applies to Spain. Or, of course, Spanish food terms in other parts of the world may mean something entirely different than they do in Spain and so I’m trying (as best I can) to focus on the vocabulary one would encounter in Spain.

I may do some more “fixes” or additions to this glossary but I don’t expect to do another major revision. As it is this is now one of the largest glossaries you’ll find anywhere on the net (and perhaps the easiest to access, just a single, albeit long, webpage, not some more complex access scheme). So while this glossary, like anything you find on the Net, is easily available one should ALWAYS be somewhat skeptical as the editor is human and makes mistakes, so check with authoritative sources for any terms that might really matter for you.

A look at my drill application

Since I’ve mentioned this in multiple posts I thought I’d provide a little more detail. Here’s a screen shot with some food terms.

Ugh, WordPress is hard to get images right, hope this looks OK after saving. Good, for some reason the image looks bad in WordPress’ post editor but I chopped the screenshot to fit and it looks OK after posting.

BTW: Spanish readers out there will note kokotxa in this list which is really Basque, not Castilian which would be cococha.

Anyway, the basic idea is to load a random (though biased to get most effective drilling) set of words and then I visually examine them. Most drills do some sort of “quiz” but this is for me so I just scan the list.

If I don’t instantly know the translation I click the word. That gives me a score of -1 (otherwise if I don’t click a word it gets a score of 0, for appearing but “known”). I don’t “cheat”, since this is just for me, so I don’t need a quiz.

But if I have the least bit of doubt I click and then I see the translation. Then I decide: a) if it was a mistake that I clicked, I click the Ignore button, b) if I thought I knew the answer but was wrong, I click the Wrong button and my score becomes -3, and, c) if I really didn’t know at all (or my “guess” was wildly wrong) I click the “no clue” button and get a score of -10.
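The scoring rules above are simple enough to write down as a small sketch. The function and key names here are made up for illustration (they are not from the actual drill program), and treating Ignore as resetting the score to 0 is my assumption about what “it was a mistake that I clicked” should mean.

```python
# Scores for the three buttons shown after clicking a word.
# "ignore" -> 0 is an assumption: an accidental click counts as "known".
VERDICT_SCORES = {"ignore": 0, "wrong": -3, "no_clue": -10}


def score_word(clicked, verdict=None):
    """Score one word in one drill round.

    clicked: True if the word was clicked to reveal its translation.
    verdict: None, "ignore", "wrong" or "no_clue" (only meaningful if clicked).
    """
    if not clicked:
        return 0   # appeared in the list but was instantly known
    if verdict is None:
        return -1  # clicked out of doubt, no further button pressed
    return VERDICT_SCORES[verdict]
```

So a round is just a mapping of word to score, e.g. `{"carne": score_word(False), "kokotxa": score_word(True, "no_clue")}`.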

After I’ve looked at all the words I click Done to record the results. Then I click Drill to get a new set of words (which is more likely to repeat words with scores other than 0). I continue as long as I can stand and then click Save (unless I’m just testing code) and the scores are then added to the XML database.
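One way to get a “random but biased” set of words is weighted sampling without replacement. The sketch below is only an illustration under assumptions: the weight formula (1 plus the size of the accumulated negative score) and the name `pick_drill_words` are invented here, not taken from the drill program.

```python
import random


def pick_drill_words(totals, k, rng=None):
    """Pick up to k distinct words, favoring words with worse accumulated scores.

    totals: dict mapping word -> accumulated score (0 or negative).
    rng: optional random.Random instance, handy for repeatable tests.
    """
    if rng is None:
        rng = random.Random()
    pool = dict(totals)
    chosen = []
    while pool and len(chosen) < k:
        words = list(pool)
        # Every word keeps a base weight of 1 so known words still show up sometimes.
        weights = [1 + abs(pool[w]) for w in words]
        word = rng.choices(words, weights=weights, k=1)[0]
        chosen.append(word)
        del pool[word]  # sampling without replacement
    return chosen
```

With totals like `{"carne": 0, "kokotxa": -10}` the second word is about eleven times as likely to be drawn first, which matches the idea of repeating mistakes more often while still cycling through the rest.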

And if I’m sure I want to record the results then I can use the File menu item to save a new copy of the XML. The XML Editor and XML Update are what I use to fix issues in the database itself.

All the drill results are saved in another part of the XML (eventually making it very large, hurrah for having lots of RAM to have all this in memory – I come from the days when RAM was scarce and had to do lots of programming tricks, now I just brute force all this).

Then I have an analysis routine (WIP) to consolidate all the scores over all the drill sessions to find out which words are worst (lots of mistakes, therefore drill more) and which are best (few or no mistakes, so only drill after some time has passed).
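A minimal sketch of what such a consolidation might look like, assuming each drill session is stored as a simple word-to-score mapping (the real results live in XML, and the names `consolidate` and `worst_first` are hypothetical):

```python
from collections import defaultdict


def consolidate(sessions):
    """Combine per-session scores into per-word totals.

    sessions: iterable of dicts mapping word -> score for that session.
    Returns {word: {"seen": n, "missed": n, "total": n}}.
    """
    stats = defaultdict(lambda: {"seen": 0, "missed": 0, "total": 0})
    for session in sessions:
        for word, score in session.items():
            entry = stats[word]
            entry["seen"] += 1
            entry["total"] += score
            if score < 0:
                entry["missed"] += 1
    return dict(stats)


def worst_first(stats):
    """Order words by average score per appearance, worst (most negative) first."""
    return sorted(stats, key=lambda w: (stats[w]["total"] / stats[w]["seen"], w))
```

Averaging per appearance keeps a word that showed up often but was always known from looking worse than a word seen once and missed. A fuller version would probably also weigh how long ago each word was last seen.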

While I intend to create other types of drills this is “good enough” to have me looking at a fair portion of my vocabulary every day (todos los días) and thus keep refreshing my wetware memory. I can’t do this very long (so the magenta number on the screen shot is a timer of how long I’ve been doing drills, rarely do I exceed 20 minutes) because I’ll start having “short-term” memory (since my mistakes are more likely to repeat in the drill, by design) and so I begin to “know” them, but not really.

I’m focusing the drill (really the way I’ve created the XML database) on recognizing the Spanish, since, again, my goal is reading menus, not writing them. So my database is (now) poorly structured for doing English drills, which is harder than the Spanish drills, but more useful if I need to be able to ask questions about the menus.

And of course this is all “written” rather than spoken drills and to be really helpful I actually need to know how hablar a camarero but I’m getting there.

Back to menus; a big project

My primary purpose for this blog is to record my progress in developing an application to translate menus in Spain. I worked diligently on this for about nine months but then got into some side-trips in other projects. But now I’m trying to get back to that primary objective.

For 78 days now I’ve also been trying to actually learn Spanish via the nice online application, Duolingo. While this diverted me from my primary task it has been useful. My sister always thought my idea was silly and that instead I should just learn the language. That’s not a bad idea but it looked harder (and more time consuming) than my primary limited work just to read menus, based on the assumption I’d soon be heading to Spain to tour along the route of the Camino de Santiago. Therefore I needed results sooner than I could learn the language.

To build my application I’d first need a large corpus of terms from menus with accurate English equivalents. To do that I’d import the text from websites into a working document and crunch through all the terms. Often that gave me some interesting observations that I was converting to posts, hopefully also interesting to my readers. Obviously there are going to be mistakes in manually collating data so my corpus needed to be carefully curated, with the terms and my “guesses” at translation with a “confidence” factor. Then via the large corpus I could extract the accurate equivalent Spanish to English translations I’d need for the application.

That’s a long slog so a couple of times I went ahead and created a minimally curated “glossary” which I have as a page here at this site. In my searches I found a number of glossaries, or even dictionaries in Spanish, covering food. Years ago when I first got interested in these I just extracted all the glossaries I could find and manually collated them into a single glossary. It was a mess!

The trouble is that food terms in Spanish (my searches) yield results that either don’t apply to Spain’s food dialect or were just wrong. After all any other person who compiles glossaries makes mistakes too. Or I’d make mistakes extracting and collating them. And my lack of any fluency in Spanish meant I often misinterpreted the raw material I was attempting to organize. That previous experience convinced me I needed to be very precise about collating material AND focused on Spain as the source of the raw material and so my idea about creating a corpus evolved.

But in nearly a year I still don’t have that corpus. And without it I can’t build my application. And in the meantime I needed to get some “drill” code done since I reached the point where I was forgetting more than I was learning. And while Duolingo is fairly good for learning Spanish it’s not as good for repeating previous lessons (and their vocabulary). And repetition is the key to learning a language. So I found myself forgetting vocabulary I’d once before acquired.

So I set out to build a drill application, which has some of the same elements I’d need in the translation application. And like compiling glossaries I’ve done this also, in the past – the first time for Italian food terms. So I’ve built drill programs before with only limited success.

The key to a drill program is to be efficient and force me to do repetitions of the vocabulary I know the least well. That’s harder than it sounds. Plus most of the types of drill I did (glorified flashcards, a common language learning technique) took so much time that as my vocabulary grew my repetition, of any particular word, got less and less frequent. Even with an hour a day I could only repeat a fraction of the vocabulary I’d acquired.

So I had some ideas how to improve this and make the drill more efficient. But I needed data even to do the programming. So I fairly quickly assembled the glossary I posted at this blog without being too concerned about its accuracy.

So with that lengthy background now I can describe what I’ve more recently done and the “big project” I’m now doing. I built my first version of the drill application centered around the Duolingo vocabulary. As I’d do each lesson I would fairly carefully assemble the “database” (a complex XML) to feed the drill program. For my Duo vocabulary that now contains about 1100 “terms” and 1400 “forms” of those terms. By forms I mean the usual four spellings of adjectives (in Spanish both gender and number) and the first set of conjugations for verbs. Getting all that going for Duo vocabulary drills got me a fairly useful and efficient drill program which is helpful as a supplement to Duolingo.

So then using that code and crunching the glossary I’d assembled here I started on the food terms. And that was a bit of a mess because the glossary sucked.

So to fix this I went back to my 30 or so working documents of all the menus I’d processed. Rather than the more difficult chore of extracting material for a well curated corpus I just quickly (a couple of days) extracted all the accumulated Spanish. That’s a tedious chore but it does reveal some of the problems of getting “raw” material from the websites. Naturally I found lots of spelling mistakes (easier for me to recognize now that I know a little Spanish) but also inconsistencies in gender and sometimes number. Also many instances of words are very inconsistent in the use of accents. My Duolingo study also let me learn the rule that accents sometimes change (for real, not typos) in certain circumstances.

So once I’d compiled all my “words” from all menus I had about 10,000 “raw” bits that I was able to clean up, de-duplicate and consolidate (like all the forms of adjectives under a single “term”) and ended up with about 5500 lines.
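The cleanup step can be partly mechanized, though only as a first pass. The sketch below is a naive illustration, not the process actually used: it lowercases and de-duplicates the raw bits, then groups words that differ only in a final -o/-a/-os/-as. That stem rule is a crude assumption that will wrongly merge unrelated words (pato and pata, say), so the result still needs review by hand.

```python
import unicodedata
from collections import defaultdict

# Naive gender/number endings; longer endings are tried first.
ENDINGS = ("os", "as", "o", "a")


def clean(raw):
    """Normalize Unicode (so accents compare equal), collapse whitespace, lowercase."""
    text = unicodedata.normalize("NFC", raw)
    return " ".join(text.split()).lower()


def naive_stem(word):
    """Strip one gender/number ending, if the remainder is at least 3 characters."""
    for ending in ENDINGS:
        if word.endswith(ending) and len(word) > len(ending) + 2:
            return word[: -len(ending)]
    return word


def consolidate_forms(raw_words):
    """Return {term: [forms...]} with duplicates removed and forms grouped by stem."""
    unique = sorted({clean(w) for w in raw_words if w.strip()})
    groups = defaultdict(list)
    for word in unique:
        groups[naive_stem(word)].append(word)
    result = {}
    for stem, forms in groups.items():
        # Prefer the masculine singular as the term when it was actually seen.
        term = stem + "o" if stem + "o" in forms else forms[0]
        result[term] = forms
    return result
```

For instance, `consolidate_forms(["Frito", "frita ", "fritos", "fritas", "frito", "carne"])` yields one entry for frito with its four forms and one for carne.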

Then in a separate process I took the latest (v3.3) copy of my glossary and then combined that with about six other glossaries. That was a chore and resulted in about 4000 entries.

So then I combined these, all the glossary “words” and all the menu “words”, and started going through all that by hand. I’m now done with everything through M (since I sort all 9000 or so lines into alphabetic order). I’ve done a few hundred “fixes” to my glossary and about 100 additions. But more importantly all those changes are in my XML “database” for the drill program. With a bit of code I can then extract from that XML to create text I can paste into the glossary page here.
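As an illustration of that last extraction step, here is a small sketch using Python’s standard `xml.etree.ElementTree`. The XML layout shown (`term`, `form` and `gloss` elements with a `lemma` attribute) is entirely hypothetical, since the real database schema isn’t described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout, for illustration only; the real database is more complex.
SAMPLE_XML = """
<vocabulary>
  <term lemma="frito">
    <form>frita</form>
    <form>fritos</form>
    <form>fritas</form>
    <gloss>fried</gloss>
  </term>
  <term lemma="cerveza">
    <gloss>beer</gloss>
  </term>
</vocabulary>
"""


def glossary_lines(xml_text):
    """Turn the XML into alphabetized 'term (forms): gloss' lines of plain text."""
    root = ET.fromstring(xml_text)
    lines = []
    for term in sorted(root.iter("term"), key=lambda t: t.get("lemma", "")):
        lemma = term.get("lemma", "")
        forms = [f.text.strip() for f in term.findall("form") if f.text]
        gloss = (term.findtext("gloss") or "").strip()
        shown = f"{lemma} ({', '.join(forms)})" if forms else lemma
        lines.append(f"{shown}: {gloss}")
    return lines
```

Calling `print("\n".join(glossary_lines(SAMPLE_XML)))` gives text that can be pasted straight into a page, one term per line.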

So when I’m finally done with all that tedious manual work I can update my glossary and it will be a big change so I’ll make that the v4.0 version which I believe will be quite a bit better than my current v3.3 but not as good as a curated corpus needs to be. And, really my glossary will then mostly contain words that exist in reference sources (several online dictionaries I use) and/or reconciliation with the other glossaries I found.

Please note, therefore, that my work product is fully derived from many sources and my editorial work and thus constitutes “original” work. I’m quite conscious of never (almost never) posting anything in this blog that would violate copyright, i.e. the wholesale use of someone else’s glossary.

And now all my material is synchronized – my XML database for the drill program, my derived glossary with reconciliation to other glossaries or reference sources, and I’m only including terms in either place that I’ve found in menus so my product is more closely aligned with Spain dialect and I can exclude other Spanish food terms.

Now, while that isn’t done, I’m back into the code for my drill program. In the case of my Duolingo vocabulary I feed into the drill program I (mostly) know that vocabulary by memory. Duolingo is divided into lessons (aka skills) that require 40 actual drills (to pass the skill and unlock the next one) which means about 800 individual drills. At Duolingo I’ve now done 16,843 “XPs” over 31 skills. On average each skill introduces around 30 words (forms actually). So when I do my “refresh my memory” drills with that vocabulary I have relatively few words I ever mark as uncertain, or worse, “I’m wrong” or “I’m clueless” (really forgot). That means all the scoring I’ve done with that vocabulary has relatively few “errors” and my aggregate score on most terms is 100%.

In contrast I’m much worse on my new food vocabulary. As I’d work on menus I’d “learn” many words, but I had almost no repetition of those (only the most common words appear on many menus, so that was my only repetition) and I’d done none of my own drill. Now that I have something to feed my drill program I’m getting a lot more “bad” scores. That’s good and bad. It’s bad because it means I don’t know those words very well, by memory. It’s good because now all the scoring of the drills I record in the XML has a lot more data than the drills on Duolingo vocabulary.

So that means back to programming. How do I consolidate tens of thousands of individual drills into some sort of metric that rates each word in the vocabulary as to how well I know it (and/or don’t confuse similar terms)? Because I want to drill myself on what I know the least. I don’t very much need to drill on carne or agua or cerveza or a few hundred other food words and I don’t want to waste the limited time I have for drills (even less than my free time because drill is tedious and I can only tolerate a certain amount each day). So those are the algorithms I’m now trying to develop so my drill program is even more efficient and therefore more useful.

So while I thought I’d be done with this by now I have probably another week to finish cleaning up my food vocabulary and enhancing my drill program. But once I’m done with that I can spend 15-30 minutes every day (or most days) so I get more of the food vocabulary into longer-term memory along with a growing Duolingo vocabulary. Thus I’d hope to have reasonable fluency within a few months so soon I may need to head to some Spanish speaking country to test myself.

Now, note, all this is “reading” (and less “writing”) Spanish. Hearing or speaking is an entirely different problem. But without mastery over much of the vocabulary actual conversation is pretty hopeless. I’d originally assumed I’d have no more audible Spanish than a few phrases and the rest I’d do through reading (plenty of time to study a menu, have to be fast to have conversation).

Now, finally, all this I’m just doing for myself, other than relating some hopefully “interesting” tidbits here in the blog. While I’ve built many software products over my working life all this I’m just doing for myself. But at least, as a derivative from this work, I do hope to end up with the best glossary for food terms in Spain here at this blog as my contribution to others who might need this.

 

Still chugging along the Camino, still learning Spanish

I’ve been so much buried in digressions I haven’t had any time to post. You might remember that my project, which is the primary subject of this blog, is to find as many menus as possible from restaurants in Spain, figure out what they “mean” (not just purely translate), and build up a corpus of menu terminology to drive the creation of an application to translate menus.

So much for that, as I haven’t been doing any of that for about a month. In addition I continue to do stationary exercise in my basement to try to stay in shape and/or control my weight (lose a little ideally) and potentially build up to a real walk. So I take my mileage on a treadmill and convert it to a location along the Camino (the French route). While I’ve kept up exercise I’ve meanwhile been digressing into another area that has interfered with my primary goals.

But nonetheless I can report that I’m now at mile 368.9, having covered 21 miles thus far in January. That may not sound like much, given most peregrinos can do 12-20 miles/day, but I’ve also done 480 miles in just January on the stationary bike, or the entire Camino.

So I had planned to do a post when I was around 344 miles, which is then near the cruz de ferro, which as Henri Sebastian (in the movie The Way) says is a place of much significance. For those of you who watched the movie or especially those of you who have actually walked the Camino you know cruz de ferro is a small iron cross at the top of a tall wooden pole with a bunch of pebbles at the base. The idea is that pilgrims carry a stone from their starting location and then deposit it along with a prayer. The location happens to also be almost the highest point along the entire route.

It all looks very quaint in the movie but looking at that location via my “virtual” walk (i.e. looking at Google Maps, satellite views and the geotagged photos Google shows; you can search for ‘cruz de ferro’ and see what I’m talking about, I don’t reproduce photos from online sources due to implied copyright) it’s not quite the same as the image of the movie. The site is near a major road and is surrounded by parking lots and picnic areas. The cross itself is unimpressive so only interesting due to its historical perspective. Plus visitors leave a lot of mess at the site so again it’s not so quaint.

Also in the movie a collection of rustic signposts is shown. It turns out that’s just a short distance from the cross in the town of Manjarín (you can search for this to see). It appears to be part of a somewhat bizarre albergue/bar near all those signs, the Manjarín Encomienda Templaria.  That too is a bit less quaint than the movie made it look. So much for fiction.

And this raises an interesting point that I couple with other observations. A “virtual” walk certainly isn’t the same as a real one, but I’ve “seen” enough to get a much better understanding of what the Camino is like. And, frankly, a lot of it isn’t that great. The people who have the spiritual connection to the route don’t care, but for merely a “tourist” who’d like a more physical experience than riding tour buses I now question whether I’d really want to ever walk the Camino.

Or at least the classic (aka French) route. So now I’ve begun to focus on the Camino del Norte route. What is still appealing to me is visiting the northern (Atlantic) coast of Spain, from France to Galicia. The country looks prettier (certainly greener) and I think the food would be better. Since my wife doesn’t want to do the walking, as a compromise we’ll do part tourist stuff (driving, hitting hot spots like Bilbao) and then some more rural touring in the vicinity of the Camino del Norte and thus have some of the same experience.

But that’s in the future.  Now as to the digressions that are bogging me down.

My original idea was that I could merely focus on a mechanical aid to “translate” the written menus without actually learning Spanish. It’s not that I didn’t want to learn Spanish, I just saw that as too difficult. My sister (RIP) disagreed with my idea and said I should learn the language. So as I recently posted I’ve started to do that since I suspect some conversation with camareros  (waiters) would be required.

But I’m not going to fill this blog with many comments about my efforts. Any reader interested in that language has a lot better resources than I can provide. And my personal issues with it are mostly a digression so I don’t want to fill this blog with my adventures. But I’ll mention a bit.

As I previously posted I found what first appeared to be a good resource for learning a bit of conversational Spanish, which I do think I’d need to be able to order in restaurants. So I’m doing the Duolingo online study and have had decent results, thus far (up to about 600 words now, still struggling with verbs, of course). But as useful as Duolingo is I find that I fairly quickly master their “skills” (aka lessons) but then almost as fast forget most of what I learned. Without repeating some of the vocabulary (or having some other way to practice) I forget.

So, naturally, given an entire lifetime of developing software I began to think about building my own drills. I’ve done this before, several times in fact. Basically I’ve built software “flash cards” but with “intelligent” repetition, where I’ve developed some, not so good, algorithms to maximize drill on the vocabulary (or to some degree grammar) on what I’m not getting. Now learning vocabulary and grammar are helpful but speaking, and worse, hearing Spanish is tough. Duolingo helps a bit for hearing, but Spanish is a language my ear/brain simply don’t get. First of all, most Spanish speakers speak really quickly (this, I’ve found from online sources, is well known in comparison to other languages). And even with Duolingo, the full speed recorded sentences that I have to either translate or simply write what I hear, I miss lots of little bits. I have a terrible time hearing the gender or verb tenses which can be critical. I figure I can botch my pronunciation, as well as gender or conjugation, and probably still be understood, but hearing any response is really going to be tough. But the better I know the vocabulary, without a big mental delay to translate in my head, the more likely I can understand the spoken part. Fortunately there are many Spanish language TV channels in my cable subscription, often with good subtitling, so I have some opportunity, beyond Duolingo, to “practice” hearing, which will be more important to me than actually speaking well.

So, of course I started working on my own software to supplement Duolingo. That does have advantages over just using online courses. To write software one really has to understand some of the structure of the language (“teaching” something to a computer is a good way to find out what I do and don’t understand). So, for instance, I just finished, after considerable study and coding, working out how to do all the conjugations of regular verbs. And I’ve extracted all the vocabulary I’m learning in Duolingo to put into drills as well. So, IOW, I’ve switched from learning about menus to learning the language to writing code to help me learn the language. Hence, the “digressions” that have diverted my time from my original goal.
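To give a flavor of what conjugating regular verbs in code involves, here is a minimal sketch covering only the present tense. It assumes the infinitive is a regular -ar, -er or -ir verb; the names `PRESENT_ENDINGS`, `PERSONS` and `conjugate_present` are made up for illustration and are not taken from any actual program described here.

```python
# Present-tense endings for regular Spanish verbs, keyed by infinitive ending.
# (Illustrative sketch only: irregular and stem-changing verbs are not handled.)
PRESENT_ENDINGS = {
    "ar": ["o", "as", "a", "amos", "áis", "an"],
    "er": ["o", "es", "e", "emos", "éis", "en"],
    "ir": ["o", "es", "e", "imos", "ís", "en"],
}

# Grammatical persons, in the same order as the endings above.
PERSONS = [
    "yo",
    "tú",
    "él/ella/usted",
    "nosotros",
    "vosotros",
    "ellos/ellas/ustedes",
]


def conjugate_present(infinitive):
    """Return a dict mapping each person to the present-tense form."""
    stem, ending = infinitive[:-2], infinitive[-2:]
    if ending not in PRESENT_ENDINGS:
        raise ValueError(f"not an -ar/-er/-ir infinitive: {infinitive!r}")
    return {
        person: stem + suffix
        for person, suffix in zip(PERSONS, PRESENT_ENDINGS[ending])
    }


if __name__ == "__main__":
    for person, form in conjugate_present("hablar").items():
        print(f"{person}: {form}")
```

Other tenses would follow the same table-lookup pattern with different ending tables, while irregular verbs would need their own exceptions list.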

But I’m beginning to see the light at the end of that tunnel (plus my coding skills were rusty, so doing my menu translation app will now be a bit easier) and maybe I can get back to my original plan and more, hopefully, interesting posts about menus, instead of my experience with learning Spanish or writing programs.

So stay tuned when I get back on track.

 

Quiero hablar más español

It’s been quite a while since my last post. In addition to all the activities of the holidays I have continued, sporadically, to work on my project that is one of the subjects of this blog. So now I can report some progress.

As a reminder I am (slowly) working my way to develop a mobile application to translate restaurant menus in Spain. To accomplish this I am finding many menus from restaurants in Spain (only Spain to avoid Spanish terms from other Spanish-speaking lands). I translate these using machine translation (mostly Google Translate), then look for discrepancies in that translation method and use either online dictionaries or Google searches to make better “guesses” about translation. Often terms on menus are not translated accurately (or at all) by machine translation.

Once I have accumulated enough raw data (a never ending process) I can create a corpus with Spanish terms and the best English translation I can produce with a “confidence” factor (expressed as a probability). Once the corpus is large enough I’ll write code to extract the best food related (and a few other terms) vocabulary with the highest confidence levels of the accuracy of the translation. Once the vocabulary is “complete” (again a never ending process) I can build my application and then test it on all the menus I’ve accumulated. I’ll judge how well I’ve done this by expecting my translation tool to work much better than other machine translations.
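The extraction step might look something like the following sketch. It assumes the corpus is simply a list of (Spanish term, English translation, confidence) entries and that "best" means the single highest-confidence candidate above some cutoff. The function name, the 0.7 threshold, and the sample confidence numbers are all invented for illustration; how confidences from multiple sources should really be combined is left open.

```python
def best_translations(corpus, min_confidence=0.7):
    """Pick the highest-confidence English translation for each Spanish term.

    corpus: iterable of (spanish_term, english, confidence) tuples, where
    confidence is a probability between 0 and 1.
    Entries below min_confidence (a hypothetical cutoff) are ignored.
    """
    best = {}
    for term, english, confidence in corpus:
        if confidence < min_confidence:
            continue
        key = term.lower()
        # Keep this candidate only if it beats what we have so far.
        if key not in best or confidence > best[key][1]:
            best[key] = (english, confidence)
    return best


# Made-up sample entries; the confidence values are not real measurements.
sample_corpus = [
    ("gambas", "prawns", 0.9),
    ("gambas", "shrimp", 0.8),
    ("pulpo", "octopus", 0.95),
    ("cecina", "jerky", 0.4),  # low confidence, so it is left out
]

vocabulary = best_translations(sample_corpus)
```

Low-confidence terms that get dropped here are exactly the ones that would need more investigation before they go into the app.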

Fine, a useful exercise as someday I hope to actually need to do this while touring Spain, an indefinite “wish” for me. Being able to accurately translate menus, and knowing something of Spain’s cuisine, I’d be able to choose wisely.

But, my sister, who was quite dedicated to mastering Spanish, albeit focused more on Mexican cuisine, was critical of my approach. Instead of just building an application her strong suggestion was merely that I should just become fluent in Spanish. A fine idea, but one I find very challenging.

Several times in my past I’ve attempted (not very vigorously) to learn Spanish. Since I lived much of my life in California some fluency in Spanish is almost a necessity. I first tried, decades ago, using the best technology then available, i.e. cassette tapes and accompanying text. Ugh. That was a bust. Later as computer tutorials became more common I also tried those, initially using DVDs as the sound source and later just online voice recordings. These attempts all failed for me.

Why? For one thing I’m not very good at foreign languages. While I studied both French and German in several years of school classes I never got very far with those. My first trip to Germany was a joke at how badly I could either speak or hear. My only real exposure to having to use French was in Québec, during the time when speaking French was a strong “political” issue. I had a bit more success with that partly because everyone, e.g. waiters in restaurants, insisted on French. My stumbling attempts were at least considered a sufficiently sensitive effort that I had some success.

But with Spanish I have a different problem. The sounds of the language are much more alien to my ear – I really can’t hear the words, especially since, it seems to me, native speakers speak very fast and to my ear the words are run together. And, my attempts at speaking were even worse than my attempts to hear and understand. So this has been very discouraging and so I rejected my sister’s urging to just actually learn the language. Additionally I had the joke running through my head that, after her years of vigorous effort, several other people judged that she had atrocious pronunciation, barely intelligible to a native Spanish speaker. If she couldn’t do it how could I possibly succeed?

BUT, in my effort to translate menus I’ve also found a serious stumbling block. Even with English menus often I need to have some conversation with the server to really understand the menu. And as I translated more and more menus I found this was even more true in Spain. Certainly discussing food with a knowledgeable server adds to the enjoyment of food (another lesson I learned from my sister who was more skilled at cooking than me and through example demonstrated how dining was more pleasant after discussing menu items in some detail).

So I happened to stumble on a new possible learning method. Just happening on an article on the Net about the best apps for “your new smartphone” (naturally timed with the assumption of Christmas gifts) I discovered Duolingo. Previously I’d done the demos with several of the subscription or purchased online tools with little success. But at least: a) Duolingo was free, and, b) it was available for my phone and so I could do the exercises at any time, not just during some study time while on my computer.

So I downloaded the app (both to phone and multiple computers) and committed myself to really giving an earnest effort to learn at least some basic Spanish. Now, as best I know, traveling in Spain in the larger cities, especially those popular with tourists, probably doesn’t require speaking or hearing Spanish. When I visited Portugal I knew zero Portuguese but managed to get by OK (with some help from hotel staff making phone calls for me). And I managed to get by in both Japan and China, although with considerable help from the people I was visiting.

But my interest in visiting Spain is out in the countryside, initially focusing on the Camino de Santiago (the French route). Now I’m looking more at the Del Norte route since that part of Spain is more appealing to me than the dull plodding through country that looks a bit too much like the Great Plains or Central Valley of California. In such areas I would expect that at least some minimal conversational skill would be necessary. My hope would be: a) I could ask Spanish speakers to speak more slowly and thus hear each word, and, b) that my poor pronunciation wouldn’t prevent them from (mostly) understanding me.

So I’ve now worked as hard as I can on Duolingo. I strongly recommend this for anyone following my blog who might have the same need, especially as it is free (gracias to the community who create these lessons). I’ve made it through 12 days and 12 of the lessons. Duolingo requires a LOT of repetition and thus this forces me to work hard enough at estudio that I actually have made some progress.  Even the sentence I used as the title of this post would have been impossible for me prior to Duolingo.

In the first part of each exercise Duolingo introduces one to vocabulary (and without the more academic approach to grammar, i.e. simple conjugation of verbs). Then the exercises move more and more to responding to spoken phrases or sentences by: a) writing what was said in English, and, b) much harder, writing what was said in Spanish. Each exercise gets steadily harder making it difficult to “guess” and thus requiring actually learning something, especially when one has to actually type the Spanish (from an utterance) and be picky about getting gender and verb conjugation right. The sheer repetition is working for me.

Despite my best progress ever attempting to learn Spanish I still find it difícil to “hear” the utterance spoken at full speed. I often either cannot hear the spaces between words or miss subtle bits (I really have trouble hearing una vs un). But since I must get every drill question right before I can proceed I muddle through. So thus far Duolingo reports I’ve now encountered 308 words (many useless for my purpose, also they count each version of a verb as a separate word). Thus far, as far as verbs go I’m still only in the present tense and with the singular persons (figuring out that usted is third person like él or ella was fun since Duolingo mostly uses the informal second person tú as ‘you’, which often would be rude for me to use in conversation).

While Duolingo focuses on conversation instead of the typical more “academic” language study (all the grammar details, especially conjugations) I’ve done more exploration with other tools (especially spanishdict.com and Wikipedia) to go beyond the Duolingo simple lessons. I’m accumulating some of my own “lessons” to supplement the Duolingo lessons.

Now another challenge for me is that I’ve also learned, in past language learning efforts, that I’m fairly good at immediate duration memory. So while I’m intensely involved I learn to recognize many words. Unfortunately weeks later I’ve forgotten most of those. So, with Duolingo I actually repeat finished exercises to continue repetition which is key.

BUT, repeating everything is time-consuming and not that helpful. The real repetition I need to do is the vocabulary (or sometimes grammar) that I do badly. So now I’m thinking about another bit of programming for my own learning tool.

Once before I built a fairly complex bit of code to extend my English vocabulary. Using something built into Kindle I would mark English words that I either didn’t know at all (like reading more “academic” texts that use more esoteric vocabulary) or that I wasn’t really sure about. Kindle had a drill application that accumulated the words I’d mark as I encountered them in some book. But the Kindle drill, like Duolingo, wasn’t very “smart” about focusing my drill time on the words that gave me the most trouble. So in my own app I developed a scoring system that adjusted my drill to the words I most often missed and also then made sure all but the easiest (for me) words were at least repeated some. I spent a lot of time tuning how that algorithm worked but never was completely satisfied with it.
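A scoring scheme along those lines can be sketched as a weighted random draw. This is only one possible approach, not a reconstruction of the original algorithm: each word's weight is its miss rate, with a small floor so that every word still comes up occasionally, and a word drops out only once it has been answered correctly several times without a single miss. The class name `DrillDeck` and the parameters `floor` and `easy_after` are hypothetical.

```python
import random


class DrillDeck:
    """Flash-card deck that drills frequently missed words more often."""

    def __init__(self, words, floor=0.1, easy_after=5):
        # Per-word stats: [attempts, misses].
        self.stats = {word: [0, 0] for word in words}
        self.floor = floor            # minimum weight so words still repeat
        self.easy_after = easy_after  # clean attempts needed to count as easy

    def record(self, word, correct):
        """Store the result of one drill question."""
        entry = self.stats[word]
        entry[0] += 1
        if not correct:
            entry[1] += 1

    def weight(self, word):
        """Relative chance that this word is picked next."""
        attempts, misses = self.stats[word]
        if attempts == 0:
            return 1.0  # unseen words get top priority
        if misses == 0 and attempts >= self.easy_after:
            return 0.0  # the easiest words drop out of the drill
        return max(self.floor, misses / attempts)

    def next_word(self, rng=random):
        """Draw the next word to drill, or None if every word is easy."""
        words = list(self.stats)
        weights = [self.weight(w) for w in words]
        if not any(weights):
            return None
        return rng.choices(words, weights=weights, k=1)[0]


deck = DrillDeck(["un", "una", "usted"])
for _ in range(5):
    deck.record("un", correct=True)   # always right: becomes "easy"
deck.record("una", correct=False)     # always wrong so far
deck.record("una", correct=False)
deck.record("usted", correct=True)    # right once: floor weight only
```

Tuning would mostly mean adjusting `floor` and `easy_after`, or replacing the plain miss rate with something that gives recent answers more weight.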

So with Duolingo as a model (incomplete for what I need) and all my past efforts at learning languages I soon will begin to build my study app (a fancy version of the classic flashcards, especially for verbs and gender). I can move all my Duolingo vocabulary to that app, plus much of what I’ve accumulated from menu study, plus just grabbing more words not found in either source from either: a) various lists I’ve found of the “most common” Spanish words, or, b) from going through a couple of dictionaries, tourist phrase books and grammar books I’ve purchased for my Kindle.

Eventually I would expect my drill app to be sufficient to potentially get by in parts of Spain where I might not find any English speakers. One thing I have learned from my foreign travel is that travel itself (public transportation, getting directions) often requires speaking to people who don’t know English (say, unlike typical tourist destinations, i.e. city hotels, museums and restaurants).

But all this is just a start. I know, largely from my experience in Québec that “immersion” is the real way to learn a language. To be someplace where there is no English mandates that I at least stumble through some sort of conversation to get what I need. Mi esposa loved her weeks in Oaxaca and wants to go back (which I’ve resisted) so perhaps I’ll give in and make the trip she wants as preparation for Spain (just as Québec can be a shorter preparation trip for going to France).

So, I won’t belabor this point much more in posts since I’ve focused this blog on food in Spain and the Camino. My efforts to learn a language are probably even more boring to my readers. But I will supplement some of my posts purely about food terms with a bit more of the conversational stuff I pick up through this other study.