Glossary Updated

This post describes a recent process to update the glossary found on this blog. I believe a reader should know how a glossary is assembled in order to know how much to trust its accuracy, so I’m trying to be as transparent about the process as possible. Furthermore, my glossary has two “biases”: 1) it is aimed at terms found in Spain, not any Spanish term from anywhere, and 2) I (mostly) only include terms I’ve actually found on the hundreds of menus from restaurants in Spain that I’ve collected and analyzed to create a highly curated corpus. So while considerable effort went into constructing the glossary, it naturally still has errors, as it was manually compiled. But I believe it is one of the better and more exhaustive glossaries you’ll find, at least for free on the Net.

After eight more days of work since my post about this effort I decided to call it “done” and update my glossary page as version 4.0. The glossary gained about 150 items, had numerous errors corrected (especially spelling, especially accents), had some definitions changed or enhanced, and adopted my “syntax” to show all the forms of a word under a single “lemma” (I just learned this term from linguistics).

Despite all the work I did there are still mistakes, omissions, inconsistencies in the lemma representations and other errors. This is the challenge of manually editing a large amount of material, even while trying to be very careful. Each time I do this manually I learn a bit more about how I’ll have to create the software to create and manage a properly curated corpus which I’ll need for my translation application.

Not every term in this glossary is really a “translation” to English as often there is no translation. So instead, based on terms I have found in the many menus from Spain restaurants that I’ve analyzed as the “raw” data, I have sometimes had to supply a description instead of either a “definition” or a translation. For instance, I researched and added most of the names of grapes used in Spanish wines, olives used in tapas and cheeses used in various dishes. While one might translate Cabrales as “blue cheese” this isn’t that helpful so descriptions work better.

So almost every term in my glossary I have found in menus. There are more terms in the various glossaries I’ve found and assembled but unless I actually see a term used in a menu in Spain I can’t be certain some term from some other glossary actually applies to Spain. Or, of course, Spanish food terms in other parts of the world may mean something entirely different than they do in Spain and so I’m trying (as best I can) to focus on the vocabulary one would encounter in Spain.

I may do some more “fixes” or additions to this glossary but I don’t expect to do another major revision. As it is, this is now one of the largest glossaries you’ll find anywhere on the net (and perhaps the easiest to access: just a single, albeit long, webpage, not some more complex access scheme). So while this glossary, like anything you find on the Net, is easily available, one should ALWAYS be somewhat skeptical as the editor is human and makes mistakes, so check with authoritative sources for any terms that might really matter to you.


A look at my drill application

Since I’ve mentioned this in multiple posts I thought I’d provide a little more detail. Here’s a screen shot with some food terms.

Ugh, WordPress is hard to get images right, hope this looks OK after saving. Good, for some reason the image looks bad in WordPress’ post editor but I chopped the screenshot to fit and it looks OK after posting.

BTW: Spanish readers out there will note kokotxa in this list which is really Basque, not Castilian which would be cococha.

Anyway, the basic idea is to load a random (though biased to get most effective drilling) set of words and then I visually examine them. Most drills do some sort of “quiz” but this is for me so I just scan the list.

If I don’t instantly know the translation I click the word. That gives me a score of -1 (otherwise if I don’t click a word it gets a score of 0, for appearing but “known”). I don’t “cheat”, since this is just for me, so I don’t need a quiz.

But if I have the least bit of doubt I click and then I see the translation. Then I decide: a) if the click was a mistake, I click the Ignore button; b) if I thought I knew the answer but was wrong, I click the Wrong button and my score becomes -3; and c) if I really didn’t know at all (or my “guess” was wildly wrong), I click the “no clue” button and get a score of -10.
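The scoring just described could be sketched roughly like this. This is not the drill application's actual code: the function name, the dictionary keys, and the treatment of Ignore (scored here as if the word had never been clicked) are assumptions made for illustration. Only the point values 0, -1, -3 and -10 come from the description above.

```python
from typing import Optional

# Point values from the post; the key names are hypothetical.
SCORES = {
    "unclicked": 0,   # word appeared and was treated as known
    "clicked": -1,    # some doubt: clicked to reveal the translation
    "wrong": -3,      # thought I knew it, but was wrong
    "no_clue": -10,   # did not know it at all
}


def score_word(clicked: bool, followup: Optional[str] = None) -> int:
    """Return the score for one word in one drill round.

    followup is None, "ignore", "wrong" or "no_clue".
    """
    if not clicked:
        return SCORES["unclicked"]
    if followup == "ignore":
        # Assumption: an accidental click counts the same as no click.
        return SCORES["unclicked"]
    if followup == "wrong":
        return SCORES["wrong"]
    if followup == "no_clue":
        return SCORES["no_clue"]
    # Clicked, saw the translation, and pressed no further button.
    return SCORES["clicked"]
```

Keeping the point values in one table makes it easy to experiment with harsher or gentler penalties later.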

After I’ve looked at all the words I click Done to record the results. Then I click Drill to get a new set of words (which is more likely to repeat words with scores other than 0). I continue as long as I can stand and then click Save (unless I’m just testing code) and the scores are then added to the XML database.
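One simple way to get a "random but biased" set of words is weighted sampling without replacement. The sketch below is only an illustration of that idea. The weighting rule (1 plus the absolute value of the accumulated score) and the function name are assumptions, not what the drill program actually does.

```python
import random


def pick_drill_words(totals, count, rng=None):
    """Pick up to `count` distinct words, favoring worse (more negative) totals.

    totals maps each word to its accumulated score (0 or negative).
    """
    rng = rng or random.Random()
    # Assumed weighting: 1 + |score|, so known words still show up sometimes.
    pool = {word: 1 + abs(score) for word, score in totals.items()}
    chosen = []
    while pool and len(chosen) < count:
        words = list(pool)
        weights = [pool[w] for w in words]
        word = rng.choices(words, weights=weights, k=1)[0]
        chosen.append(word)
        del pool[word]  # remove it so the same word is not drawn twice
    return chosen
```

Passing a seeded `random.Random` makes a drill set reproducible, which helps when testing code rather than actually drilling.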

And if I’m sure I want to record the results then I can use the File menu item to save a new copy of the XML. The XML Editor and XML Update are what I use to fix issues in the database itself.

All the drill results are saved in another part of the XML (eventually making it very large, hurrah for having lots of RAM to have all this in memory – I come from the days when RAM was scarce and had to do lots of programming tricks, now I just brute force all this).

Then I have an analysis routine (WIP) to consolidate all the scores over all the drill sessions to find out which words are worst (lots of mistakes, therefore drill more) and which are best (few or no mistakes, so only drill after some time has passed).
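Since the analysis routine is still a work in progress, the following is only a minimal sketch of what such a consolidation might look like, assuming each drill session can be reduced to a mapping of word to score. The function names and the choice of the average score as the ranking key are hypothetical.

```python
from collections import defaultdict


def consolidate(sessions):
    """Combine per-session scores into per-word statistics.

    sessions is a list of dicts mapping word -> score for that session.
    Returns a dict mapping word -> (times_seen, total_score, average_score).
    """
    seen = defaultdict(int)
    total = defaultdict(int)
    for session in sessions:
        for word, score in session.items():
            seen[word] += 1
            total[word] += score
    return {w: (seen[w], total[w], total[w] / seen[w]) for w in seen}


def worst_first(stats):
    """Order words from worst average score to best (ties broken by spelling)."""
    return sorted(stats, key=lambda w: (stats[w][2], w))
```

Words at the front of `worst_first` are candidates for more drilling; words at the back can wait until some time has passed.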

While I intend to create other types of drills this is “good enough” to have me looking at a fair portion of my vocabulary every day (todos los días) and thus keep refreshing my wetware memory. I can’t do this very long (so the magenta number on the screen shot is a timer of how long I’ve been doing drills, rarely do I exceed 20 minutes) because I’ll start having “short-term” memory (since my mistakes are more likely to repeat in the drill, by design) and so I begin to “know” them, but not really.

I’m focusing the drill (really the way I’ve created the XML database) on recognizing the Spanish, since, again, my goal is reading menus, not writing them. So my database is (now) poorly structured for doing English drills, which is harder than the Spanish drills, but more useful if I need to be able to ask questions about the menus.

And of course this is all “written” rather than spoken drills and to be really helpful I actually need to know how hablar a camarero but I’m getting there.

Back to menus; a big project

My primary purpose for this blog is to record my progress in developing an application to translate menus in Spain. I worked diligently on this for about nine months but then got into some side-trips in other projects. But now I’m trying to get back to that primary objective.

For 78 days now I’ve also been trying to actually learn Spanish via the nice online application, Duolingo. While this diverted me from my primary task it has been useful. My sister always thought my idea was silly and that instead I should just learn the language. That’s not a bad idea but it looked harder (and more time consuming) than my primary limited work just to read menus, based on the assumption I’d soon be heading to Spain to tour along the route of the Camino de Santiago. Therefore I needed results sooner than I could learn the language.

To build my application I’d first need a large corpus of terms from menus with accurate English equivalents. To do that I’d import the text from websites into a working document and crunch through all the terms. Often that gave me some interesting observations that I was converting to posts, hopefully also interesting to my readers. Obviously there are going to be mistakes in manually collating data so my corpus needed to be carefully curated, with the terms and my “guesses” at translation with a “confidence” factor. Then via the large corpus I could extract the accurate equivalent Spanish to English translations I’d need for the application.

That’s a long slog so a couple of times I went ahead and created a minimally curated “glossary” which I have as a page here at this site. In my searches I found a number of glossaries, or even dictionaries in Spanish, covering food. Years ago when I first got interested in these I just extracted all the glossaries I could find and manually collated them into a single glossary. It was a mess!

The trouble is that my searches for food terms in Spanish yielded results that either didn’t apply to Spain’s food dialect or were just wrong. After all, any other person who compiles glossaries makes mistakes too. Or I’d make mistakes extracting and collating them. And my lack of any fluency in Spanish meant I often misinterpreted the raw material I was attempting to organize. That previous experience convinced me I needed to be very precise about collating material AND focused on Spain as the source of the raw material, and so my idea about creating a corpus evolved.

But in nearly a year I still don’t have that corpus. And without it I can’t build my application. And in the meantime I needed to get some “drill” code done since I reached the point where I was forgetting more than I was learning. And while Duolingo is fairly good for learning Spanish it’s not as good for repeating previous lessons (and their vocabulary). And repetition is the key to learning a language. So I found myself forgetting vocabulary I’d once before acquired.

So I set out to build a drill application, which has some of the same elements I’d need in the translation application. And like compiling glossaries I’ve done this also, in the past – the first time for Italian food terms. So I’ve built drill programs before with only limited success.

The key to a drill program is to be efficient and force me to do repetitions of the vocabulary I know the least well. That’s harder than it sounds. Plus most of the types of drill I did (glorified flashcards, a common language learning technique) took so much time that as my vocabulary grew my repetition, of any particular word, got less and less frequent. Even with an hour a day I could only repeat a fraction of the vocabulary I’d acquired.

So I had some ideas how to improve this and make the drill more efficient. But I needed data even to do the programming. So I fairly quickly assembled the glossary I posted at this blog without being too concerned about its accuracy.

So with that lengthy background now I can describe what I’ve more recently done and the “big project” I’m now doing. I built my first version of the drill application centered around the Duolingo vocabulary. As I’d do each lesson I would fairly carefully assemble the “database” (a complex XML) to feed the drill program. For my Duo vocabulary that now contains about 1100 “terms” and 1400 “forms” of those terms. By forms I mean the usual four spellings of adjectives (in Spanish both gender and number) and the first set of conjugations for verbs. Getting all that going for Duo vocabulary drills got me a fairly useful and efficient drill program which is helpful as a supplement to Duolingo.
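To make the idea of “terms” and “forms” concrete, here is a small sketch for the simplest case only: regular adjectives whose dictionary form ends in -o, which take the endings -o, -a, -os, -as. Adjectives ending in -e or a consonant, irregular words, and verb conjugations are deliberately left out, and the function names are made up for illustration.

```python
def adjective_forms(lemma):
    """Return the four forms of a regular adjective whose lemma ends in -o."""
    if not lemma.endswith("o"):
        raise ValueError("this sketch only handles regular -o adjectives")
    stem = lemma[:-1]
    # masculine singular, feminine singular, masculine plural, feminine plural
    return [stem + "o", stem + "a", stem + "os", stem + "as"]


def index_forms(lemmas):
    """Map every generated form back to its lemma, e.g. 'fritas' -> 'frito'."""
    return {form: lemma for lemma in lemmas for form in adjective_forms(lemma)}
```

With an index like this, a form seen on a menu (say `fritas`) can be looked up under its single term (`frito`).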

So then using that code and crunching the glossary I’d assembled here I started on the food terms. And that was a bit of a mess because the glossary sucked.

So to fix this I went back to my 30 or so working documents of all the menus I’d processed. Rather than the more difficult chore of extracting material for a well curated corpus I just quickly (in a couple of days) extracted all the accumulated Spanish. That’s a tedious chore but it does reveal some of the problems of getting “raw” material from the websites. Naturally I found lots of spelling mistakes (easier for me to recognize now that I know a little Spanish) but also inconsistencies in gender and sometimes number. Also many instances of words are very inconsistent in the use of accents. My Duolingo study also let me learn the rule that accents sometimes change (for real, not typos) in certain circumstances.

So once I’d compiled all my “words” from all menus I had about 10,000 “raw” bits that I was able to clean up, de-duplicate and consolidate (like all the forms of adjectives under a single “term”) and ended up with about 5500 lines.
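Part of this cleanup could be automated. The sketch below groups raw strings that differ only in case, spacing, or accent marks, so that inconsistent spellings can be reviewed side by side. It is an assumption about how one might do this, not a description of what was actually done (that work was manual). It removes only the acute accent and the dieresis and keeps ñ, which is a separate letter in Spanish. Because accents sometimes legitimately change, the groups are meant for human review, not automatic correction.

```python
import unicodedata
from collections import defaultdict


def fold(text):
    """Lowercase and drop acute accents and diereses, but keep the tilde of ñ."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    kept = "".join(ch for ch in decomposed if ch not in "\u0301\u0308")
    return unicodedata.normalize("NFC", kept)


def group_variants(raw_terms):
    """Group raw strings by folded spelling; each group lists its distinct variants."""
    groups = defaultdict(set)
    for term in raw_terms:
        cleaned = " ".join(term.split())  # trim and collapse runs of whitespace
        if cleaned:
            groups[fold(cleaned)].add(cleaned)
    return {key: sorted(variants) for key, variants in groups.items()}
```

Any group with more than one variant is a spot where the menus disagree about spelling and a dictionary check is worthwhile.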

Then in a separate process I took the latest (v3.3) copy of my glossary and then combined that with about six other glossaries. That was a chore and resulted in about 4000 entries.

So then I combined these, all the glossary “words” and all the menu “words” and started going through all that by hand. I’m now done with everything through M (since I sort all 9000 or so lines into alphabetic order). I’ve done a few hundred “fixes” to my glossary and about 100 additions. But more importantly all those changes are in my XML “database” for the drill program. With a bit of code I can then extract from that XML to create text I can paste into the glossary page here.
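The actual XML schema isn't shown in this post, so the following sketch invents one (a `vocabulary` root with `term` elements carrying `lemma` and `english` attributes and `form` children) purely to illustrate the kind of extraction meant here. The output format of each glossary line is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data in a made-up schema.
SAMPLE = """
<vocabulary>
  <term lemma="queso" english="cheese">
    <form>queso</form><form>quesos</form>
  </term>
  <term lemma="frito" english="fried">
    <form>frito</form><form>frita</form><form>fritos</form><form>fritas</form>
  </term>
</vocabulary>
"""


def glossary_lines(xml_text):
    """Turn each <term> into one glossary line: 'lemma (other forms): english'."""
    root = ET.fromstring(xml_text.strip())
    lines = []
    for term in root.iter("term"):
        lemma = term.get("lemma")
        # List every form except the lemma itself.
        forms = [f.text for f in term.findall("form") if f.text != lemma]
        label = lemma if not forms else f"{lemma} ({', '.join(forms)})"
        lines.append(f"{label}: {term.get('english')}")
    return sorted(lines)
```

Note that `sorted` orders by raw character code, so accented initials would sort after z. A real glossary page would probably want a locale-aware sort.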

So when I’m finally done with all that tedious manual work I can update my glossary and it will be a big change so I’ll make that the v4.0 version which I believe will be quite a bit better than my current v3.3 but not as good as a curated corpus needs to be. And, really my glossary will then mostly contain words that exist in reference sources (several online dictionaries I use) and/or reconciliation with the other glossaries I found.

Please note, therefore, that my work product is fully derivative from many sources plus my editorial work and thus constitutes “original” work. I’m quite conscious of never (almost never) posting anything in this blog that would violate copyright, i.e. the wholesale use of someone else’s glossary.

And now all my material is synchronized – my XML database for the drill program, my derived glossary with reconciliation to other glossaries or reference sources, and I’m only including terms in either place that I’ve found in menus so my product is more closely aligned with Spain dialect and I can exclude other Spanish food terms.

Now, while that isn’t done, I’m back into the code for my drill program. In the case of my Duolingo vocabulary I feed into the drill program I (mostly) know that vocabulary by memory. Duolingo is divided into lessons (aka skills) that require 40 actual drills (to pass the skill and unlock the next one) which means about 800 individual drills. At Duolingo I’ve now done 16,843 “XPs” over 31 skills. On average each skill introduces around 30 words (forms actually). So when I do my “refresh my memory” drills with that vocabulary I have relatively few words I ever mark as uncertain, or worse, “I’m wrong” or “I’m clueless” (really forgot). That means all the scoring I’ve done with that vocabulary has relatively few “errors” and my aggregate score on most terms is 100%.

In contrast I’m much worse on my new food vocabulary. As I’d work on menus I’d “learn” many words, but I had almost no repetition of those (the most common words appear on many menus, so that was my only repetition) and I’d done none of my own drills. Now that I have something to feed my drill program I’m getting a lot more “bad” scores. That’s good and bad. It’s bad because it means I don’t know those words very well, by memory. It’s good because now all the scoring of the drills I record in the XML has a lot more data than the drills on Duolingo vocabulary.

So that means back to programming. How do I consolidate tens of thousands of individual drills into some sort of metric that rates each word in the vocabulary as to how well I know it (and/or don’t confuse similar terms)? I want to drill myself on what I know the least. I don’t very much need to drill on carne or agua or cerveza or a few hundred other food words and I don’t want to waste the limited time I have for drills (even less than my free time because drill is tedious and I can only tolerate a certain amount each day). So those are the algorithms I’m now trying to develop so my drill program is even more efficient and therefore more useful.
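One possible shape for such a metric, offered only as a sketch since the real algorithm is still being worked out: weight each past penalty by how recent it is, and add a small bonus for words that haven't appeared for a while. The exponential decay, the 14-day half-life, and the function name are all assumptions chosen for illustration.

```python
def priority(history, today, half_life_days=14.0):
    """Return a drill priority for one word; higher means drill sooner.

    history is a list of (day_number, score) pairs, with score 0 or negative.
    """
    if not history:
        return float("inf")  # never drilled: show it first
    penalty = 0.0
    for day, score in history:
        age = today - day
        # Older mistakes count for less: halved every half_life_days.
        penalty += -score * 0.5 ** (age / half_life_days)
    days_idle = today - max(day for day, _ in history)
    # Even well-known words slowly rise in priority while unseen.
    return penalty + days_idle / half_life_days
```

Under these assumptions a word like carne, always scored 0, only resurfaces after sitting idle for a long time, while a recent "no clue" dominates the ranking.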

So while I thought I’d be done with this by now I have probably another week to finish cleaning up my food vocabulary and enhancing my drill program. But once I’m done with that I can spend 15-30 minutes every day (or most days) so I get more of the food vocabulary into longer-term memory along with a growing Duolingo vocabulary. Thus I’d hope to have reasonable fluency within a few months, so soon I may need to head to some Spanish-speaking country to test myself.

Now, note, all this is “reading” (and less “writing”) Spanish. Hearing or speaking is an entirely different problem. But without mastery over much of the vocabulary actual conversation is pretty hopeless. I’d originally assumed I’d have no more audible Spanish than a few phrases and the rest I’d do through reading (plenty of time to study a menu, have to be fast to have conversation).

Now, finally, all this I’m just doing for myself, other than relating some hopefully “interesting” tidbits here in the blog. I’ve built many software products over my working life, but this one is just for me. But at least, as a derivative from this work, I do hope to end up with the best glossary for food terms in Spain here at this blog as my contribution to others who might need this.

 

More trail photos; < 100 miles to go

I was close in the previous post when I declared I’d crossed the border into Galicia, but now I do have less than 100 miles to go on my virtual hike. At the slow pace I’m doing on machines that is a couple of months.

But this post is mostly about photos in my continuing series of photos I’m finding in my personal archive of trails (or crude roads). As it’s said in the movie, “the road is among our oldest tropes”. There is something about a path that holds us, compels us to move forward on that path. So here’s the first of this series:

This is a short trail along a river we found on the way to the Natchez Trace in Mississippi. It was a pleasant walk through the woods. I don’t much like photos that include me but in this case I relented. But from behind it could be anybody.

So let’s get something a bit more visible:

This was an unexpected and quite beautiful hike in Guadalupe Mountains National Park just across the border of New Mexico into western Texas. While this photo doesn’t show the fantastic fall color we encountered, totally unexpected for such a dry place, it is one of the few pictures of me on the trail, taking photos of course. Here the trail crosses a dry riverbed that probably experiences the classic rapid flooding when there are rains. This is along the route through McKittrick Canyon which I can highly recommend, especially in the fall.

And, for what I hope is the last time I do this, here is another hike, this time across country on no trail at all:

This couldn’t be in a more different location. Here we’re hiking overland in the Big Snowy Mountains of Wyoming. I’ve visited this area multiple times (the nearest big mountains to my home in flat Nebraska). The interpretative signs there claim that at one point in Earth’s history these were the highest mountains on the planet.

This shot is late fall and there is even a bit of snow falling. The purpose of going cross country to “nowhere” is indicated by the invisible object I’m holding, a Garmin eTrex GPSr. We’re headed to a “dashpoint”, a completely arbitrary coordinate on the earth to try to reach if you can. Usually we reach these points with a car but this was a case where the dashpoint was on public land and thus a place where we could hike.

Actually this was a tough hike because much of the area was even more rocky than you see in this photo. Without an actual trail, scrambling over rocks can be very tiring. But we found the dashpoint and returned to the car (had to drive to civilization to file our reports) and escaped the snow that closed in just after we were there.

Looking at all the photos of the Camino, the closest I’ve come to actually trekking there, it’s very pleasant, but if one seeks some beautiful country off the beaten path it’s hard to beat the USA. This isn’t some patriotic chauvinism, just a simple statement of geography. When I see the area around the Camino and realize how long people have been there, with terrain altering technology, part of the beauty of the “nowhere” in USA (or even more so in our neighbor to the north) is simply that people, at least with much technology, have been here such a brief time and thus so much of the land is only slightly altered.

In the trail I showed in the previous post, the very symbol of “civilization” (the railroad) has retreated and disappeared, and nature has reclaimed with greenery the narrow corridor where steam once prevailed; fortunately it is now a place of respite for trekkers.

So enjoy these photos because of the 30,000 I have (with a few worth posting) these are probably the only ones where I’ll be in the shot.

Made it to Galicia; Another trail picture

I’ve now pushed through 393.3 miles on my virtual Camino (i.e. treadmill in the basement) thus putting me just past O Cebreiro which is just over the border into Galicia, the last autonomous community before reaching Santiago in just about 100 miles. By “reached”, of course, I mean I’ve done the distance (from Saint-Jean-Pied-de-Port) along a GPS track of the Camino. I’d love to be doing this for real, but at least I get to “experience” some of what this trek is like, checking out restaurant menus along the way, which is the primary topic of this blog.
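For the curious, converting a mileage total into a position along a GPS track can be sketched as below. This is an illustration under stated assumptions, not the method actually used here: the track is taken to be a list of (latitude, longitude) points in walking order, leg lengths use the haversine formula with a mean Earth radius of 3958.8 miles, and positions within a leg are interpolated linearly, which is only reasonable when track points are close together.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius; an approximation


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def position_at(track, miles):
    """Return the (lat, lon) reached after walking `miles` along `track`."""
    remaining = max(miles, 0.0)
    for start, end in zip(track, track[1:]):
        leg = haversine_miles(start, end)
        if leg > 0 and remaining <= leg:
            t = remaining / leg  # fraction of this leg already covered
            return (start[0] + t * (end[0] - start[0]),
                    start[1] + t * (end[1] - start[1]))
        remaining -= leg
    return track[-1]  # walked past the end of the track
```

The resulting coordinate can then be pasted into a map or street-level viewer to see what is nearby.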

And at least I’ve gotten some idea of what the trek looks like. That is, by converting treadmill miles into locations and then using Google Street View to “look around”, I’ve also decided that most of the trek is in pretty boring country, not much different than what is around here. However, since reaching Ponferrada from the east, where the way begins to enter the mountains, the country has been much prettier. But also, interestingly, it seems that lodging and restaurants are a bit higher quality as well. I deduce that’s because most of the escorted trips along the Camino occur in this area, as only about 100km are required in order to qualify for the compostelana (diploma) and so trekkers who want a bit more luxury and a lot less walking start much closer to Santiago. Which, of course, is a “cheat” as getting there (as opposed to being there) is the whole point of the trek.

But here’s another of my trail photos, one of my favorites:

OK, so it’s a pretty ordinary looking spot and not at all spectacular. So why is it one of my favorites?

Well, it’s accessible and pleasant walking, that’s why. This is one of many bridges on the Wabash Trail, which goes from the south side of Council Bluffs, Iowa to the Missouri border. It’s a Rails-to-Trails recreation project, one of many sponsored all over the USA.

You see when railroads were first built in the US the land was granted by either the state or the Federal government, often with a provision that if the railroad is abandoned the land reverts back to government (thus public) ownership. Now Iowa is the most intensely farmed state in the USA which means very little land is in its natural condition (and it’s all private, so no access for recreation). So this tiny corridor of “wild” for the Wabash Trail is a real jewel.

Also, though it may just be urban legend, the original trains that used this route burned coal (or even wood) and so burning embers escaped their smokestacks. As a result the railroad had a wide buffer of land to avoid setting stuff on fire. Today, given that the entire right of way is abandoned, woods have reclaimed that area, except for the trail itself.

So, even though there are farms and houses everywhere along this trail it does a good job of pretending to be wilderness. And all that plant growth creates enough shade that the trail is much cooler for walking than out in the sun (one of the obvious drawbacks of so much of the Camino, exposed to intense sunshine).

This particular photo is where I stopped for a brief rest (that’s my stuff on the bridge). The bridges were for the trains and have been reclaimed and converted for foot and bicycle traffic, which is handy, not having to ford creeks. In addition to the buffer of woods along this trail, often it is cut into the hills so the train had a level grade and that also increases the isolation.

I’ve walked almost all of this trail, although only in intervals, never end-to-end. The problem, compared to the Camino, is there are no accommodations along this trail. Even on a bike it would be hard to cover all of it in a single day and walking is a multi-day trek. While there is some access to food and drink along the trail the only way to walk all of it would be to have someone drive to meet you and take you to some overnight lodging. That kinda defeats the point of it.

This bridge is on the longest stretch I’ve done in one trip, about 15 miles, where I had someone drop me off and then meet me in the town of Malvern where we had a pleasant lunch with a couple of craft brews. I wanted to push for 20 miles but my ride wasn’t going to wait for another two hours, so this was the best I could do. Of course one other approach would be to get my ride to haul my bike down to my turnaround spot and so walk one way and bike back, but that’s a lot of trouble. So while I like hiking on this trail, there are two drawbacks: a) I have to drive 30 miles to get to it, and b) the logistics of a long hike are nearly impossible. That is part of the reason the Camino, just from the POV of hiking, is attractive.

When the rails were removed the rock bed under the rails was left and then covered with a crushed limestone aggregate. So actually the walking surface is quite pleasant. The trail is well drained so rarely muddy but it’s much “softer” walking than paved roads would be. Again, with all my StreetView studies of the Camino much of that route is NOT very good walking and certainly walking on streets and dodging cars is not my idea of a good trek.

So while this Wabash Trail may not have the history or significance or the experience of a different country I’m grateful it exists and provides some opportunity to move on foot outside instead of always in the basement on a treadmill. Of course, right now it’s buried in snow and it’s nearly 0F outside so I’ve got a month or two before I set foot on this trail again.

A Camino not taken

Since I lived in California, in Palo Alto, not far from the major street El Camino Real I have known that ‘camino’ means way or route or road. What I’ve now learned is that it is derived from the verb caminar and the first person singular (yo) conjugation is camino; IOW, it also means “I walk”. I was born in the city of Amarillo Texas which I now (mostly) know how to pronounce correctly and that it is the masculine singular adjective ‘yellow’. Interesting how Spanish has been all around me.

But that’s not what this post is about. As I’ve mentioned I’ve gone off now on several digressions from my original project and subject of this blog – that is a virtual trek along the Camino de Santiago, decoding restaurant menus along the way so I can produce a food specific translation tool. I haven’t dropped that project and from time to time, given I’m still putting in miles on my treadmill which I convert to distance along the Camino route, I do check what restaurants I “encounter”, that is via Google maps and StreetViews and their ratings and most importantly user submitted photos. I’ve seen thousands of typical mom-and-pop Spanish dishes but with the exception of a Spain oriented restaurant in Columbus, Ohio I have yet to actually taste any comida española.

What I have been doing, after getting a new computer to use in my programming projects, is going back through 30,000 old digital photos and selecting those that either are visually interesting or interesting as reminders of my travels. So, in the absence of any other posts getting created, I thought I could just try adding a few of these photos. I’ve had a fondness for photos, otherwise fairly boring, of roads or, better, trails I do manage to walk. So I figured I’d post all my photos of trails (many more interesting than much of the Camino) and when I run out of those then some roads.

So I’ll start with this one:

Note: I still haven’t quite figured out how WordPress resizes photos so I have much higher resolution photos than I’m posting and I’ll have to figure out the trick to getting better quality.

I picked this shot because it fits my title: the footpath bifurcates into a more obvious trail and a lesser one. Naturally I hiked the lesser one (this was about four years ago).

The location is the Theodore Roosevelt National Park in North Dakota, USA. I camped there for about a week, often overrun by bison who decided the fresh spring grass in my campsite was what they wanted to eat. So I sometimes retreated to my car as a thin nylon tent is not much to stop a bison. This park has its name from the fact that President Teddy Roosevelt, as a young man, had various physical weaknesses and he chose to go to North Dakota (not then a National Park) and try his hand at ranching. Despite definitely being a “dude” from a rich East Coast family, tough and later rough-rider Teddy eventually impressed the locals with his tenacity and energy. So when the land was transferred to the US Park Service naturally it was named after him.

Now the area of the park is interesting because it appears almost out of nowhere in the middle of the very flat plains of western North Dakota. At the bottom (where that trail is, in the first photo) it can appear to be mountainous, but it is actually canyons created by the Little Missouri River, which eventually flows into the Missouri River, which is the border of my home state, about 10 miles away. So here’s a sample of the larger area from the top of the canyons:

Note: This photo still looks horribly fuzzy to me despite having 1920 pixel wide resolution – what is WordPress doing?

Since this is spring it’s quite green at this time of year but this is mostly prairie with some cedar trees. It was considered fairly inhospitable to any life and was known as “badlands”. But now there is actually a Badlands National Park in South Dakota, which I visited on the way to TRNP and it looks rather different.

So this is my first “camino” post with what looks like disappointing photos (in WordPress edit mode) and hopefully these images will look better in the finished post. Otherwise I’m clueless, now, how to get decent photos that I have (from a Nikon, not a cell phone) into posts.

I won’t do that many of these so it’s not just filler until I get back to Spain restaurant menus but I do enjoy walking (me gusta caminar) and I have some photos to prove it.

p.s. Since I mentioned bison (incorrectly called ‘buffalo’) I suppose I should include a picture of one of those sitting in my campsite near my fire pit.


This is a real wild (and very large) animal. I was amused by signs warning tourists not to bother the wildlife, but there was no sign explaining what to do if the wildlife was bothering us!

Something different

One problem with a virtual trek is that I don’t get a chance to take my own photos. I can’t post photos of other people so I can only talk about my “trek”. So photos to follow, but a little preface (scroll down if you’re impatient for the good stuff).

Well, actually I do go places. And I take photos. I very much enjoy the posts of loyal readers with fantastic photos, places I’d love to see, but at least I can experience through other people’s postings. So here’s a few to return the favor.

So, I recently got a new computer and I really wanted a new and fresh set of photos for my screen saver on my new large display. So I dug into my archive of over 40,000 photos to pick a few of the best. It was an adventure to look back over almost 20 years and a variety of digital cameras.

And while Spain, my current interest, is not much like Texas, there is some resemblance. When I first moved to Nebraska from the San Francisco Bay Area (Los Altos to be specific) I was really depressed. Within an hour of my old house I could find, even on foot, beautiful country. Within a few more hours I could either be cross-country skiing or sipping wine in Napa or riding my bike along the Pacific Coast. In contrast even 6-8 hours of driving from Omaha it’s still just cornfields. So I went crazy, also given it was winter in Nebraska, and I threw my backpacking gear in the car and headed south. Three days later I found myself in Big Bend National Park in Texas. Now I get to say anything I want about Texas because I was born there in a city called Amarillo, pronounced, needless to say, nowhere near the correct Spanish pronunciation of the adjective, ‘yellow’. Texas is a huge state, probably bigger than Spain, so I’d never been to Big Bend and it was a thrill to visit. Later I convinced my wife that visiting some place where I’d been sleeping in a tent on the ground was still a fun vacation.

As many of my loyal Readers are not from the USA, you might still know that our insane president (pretender) wants to build a wall along the USA and Mexico border. Actually there is a “barrier” on most of the border except Texas. Folks in Texas hate “eminent domain” so even putting up fences has run into local opposition. But the real “barrier” is nature, fierce, but beautiful.

But far more important, a big chunk of the US/Mexican border is a fantastically beautiful place, either the National Park or the Texas State Park. Twice I’ve visited this area and the second time I had a digital camera so here are some photos to give you a feel for this beautiful place AND how impossible the terrain is for any sort of hordes crossing the border. I’m not sure I’ve seen any border that is LESS possible for easy crossing. And it would be horrible to spoil the beauty of this area with an utterly useless Wall just to make MAGAs in Michigan (who’ve never been anywhere near the border) happy.

So here are my photos, please ENJOY this beautiful place. And for once I can contribute something to see.

Ick. There is something I don’t understand about posting photos. These photos look like a blurry mess, but not what I have in my files (these are originally 15Mpixel files from a Nikon). I’m trying various things to make them look like I see them, not sure what WordPress needs.

Here are a couple of scenic vistas in the general vicinity of the border:

Actually this isn’t quite near the border, it’s the Chisos Basin in Big Bend National Park but that’s where I was for this fantastic sunrise (it is about 5AM and a long exposure). Chisos Basin is the only accommodation in the park and is surrounded by mountains on all sides. The air is incredibly clear, and, of course, dry (it is desert) so sunrises and sunsets are fantastic.

Did I mention you can see the sky here? My photos don’t even come close to the experience you can have, standing in the desert and seeing sky everywhere.

But now we come to the border.

From the US side this is looking north, in Texas State Park, with the Rio Grande behind us.

(Note: these photos look crummy to me, but they’re not all blurry like I see them as I make this post. I guess I don’t understand how to incorporate good photos in WordPress – click on the photo for a better one, but still much lower resolution than my original).

You can just barely see the river here, but this is a hint of surrounding country.

And here it is: the border, the Rio Grande – you can see the streams of immigrants flooding across. They come well equipped with climbing gear.

Again, does that look like the kind of river you’re going to see a migrant caravan of women and children rushing across? Good luck, kids.

A few miles down the river, still rough country – great sightseeing on the highway on the US side, pretty rough country with miles of desert on the Mexican side.

Here the Rio Grande might be easy to cross, but

here, not so much. This is the St Elena Gorge, an awesome cleft with steep cliffs on both sides of the border. When I first went to Big Bend my parents, who were “snowbirds” (people in cold climates with RVs who head to warmer climes near the border) warned me about Mexicans stealing my car. When I saw this gorge my reaction was – GOOD LUCK. A huge expanse of fierce desert to get to this gorge and then technical rock climbing to get to the US side. Hey, anyone intrepid enough to make that journey can steal my car! Needless to say there were no car thieves, nor anyone except USA tourists, anywhere near this spot.

Maybe this crossing is a lot easier, but still seriously demanding of outdoor skills.

And in case these barriers are not discouraging here’s a few other things you would face.

 

Amazing, this guy, about the size of my hand was just sauntering across the highway. Supposedly they’re fairly gentle but I wouldn’t want to put that idea to the test.

And, just more fun

These are called “horse crippler” cactus, and for good reason. Anyone daring this part of the world needs serious boots (and a good eye not to step on these).

A few times in my life I just zipped through the southwestern deserts of the USA but when I finally visited, slowly, on foot, these areas I was stunned at their beauty, something you have to see close up and in sync with nature.

The idea of putting a 10m high wall across this country, despite its stupidity for all the other reasons, is a criminal offense against the sanctity of nature. Spain has its beautiful spots, which I still hope to see, but the USA has fantastic spots as well.

Now, these photos are yucky, so I’m going to see if I can make them look better, more like I see them (I do have a rather good Nikon camera to shoot this stuff, not some two-bit cellphone camera).

Still chugging along the Camino, still learning Spanish

I’ve been so buried in digressions I haven’t had any time to post. You might remember that my project, which is the primary subject of this blog, is to find as many menus as possible from restaurants in Spain, figure out what they “mean” (not just purely translate), and build up a corpus of menu terminology to drive the creation of an application to translate menus.

So much for that, as I haven’t been doing any of that for about a month. In addition I continue to do stationary exercise in my basement to try to stay in shape and/or control my weight (lose a little ideally) and potentially build up to a real walk. So I take my mileage on a treadmill and convert it to a location along the Camino (the French route). While I’ve kept up exercise I’ve meanwhile been digressing into another area that has interfered with my primary goals.

But nonetheless I can report that I’m now at mile 368.9, having covered 21 miles thus far in January. That may not sound like much, given most peregrinos can do 12-20 miles/day, but I’ve also done 480 miles in just January on a stationary bike, or about the length of the entire Camino.

So I had planned to do a post when I was around 344 miles, which is near the cruz de ferro, which as Henri Sebastian (in the movie The Way) says is a place of much significance. For those of you who watched the movie or especially those of you who have actually walked the Camino you know cruz de ferro is a small iron cross at the top of a tall wooden pole with a bunch of pebbles at the base. The idea is that pilgrims carry a stone from their starting location and then deposit it along with a prayer. The location happens to also be almost the highest point along the entire route.
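For the curious, the treadmill-to-Camino conversion I mentioned is simple enough to sketch in a few lines of Python. This is only an illustration, not the spreadsheet I actually use: the `locate` function and the waypoint list are made up for this example, and the only distances in it are the two rough figures from this post (about 344 miles to the cruz de ferro and about 480 miles for the whole route).

```python
# Waypoints as (name, miles from the start). Only two figures come from
# this post: roughly 344 miles for the Cruz de Ferro and roughly 480 miles
# for the entire route. Both are approximate; more waypoints can be added.
WAYPOINTS = [
    ("start", 0.0),
    ("Cruz de Ferro", 344.0),
    ("end of route", 480.0),
]


def locate(miles_walked, waypoints=WAYPOINTS):
    """Return (last waypoint passed, next waypoint, miles to the next one)."""
    passed = waypoints[0][0]
    for name, at in waypoints:
        if miles_walked >= at:
            passed = name
        else:
            return passed, name, round(at - miles_walked, 1)
    # Walked past the final waypoint: nothing left ahead.
    return passed, None, 0.0


print(locate(368.9))
```

With my current 368.9 miles this says I am past the Cruz de Ferro with about 111 miles to go. A finer list of towns would give a more useful answer, but I’d have to look up their distances first.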

It all looks very quaint in the movie but looking at that location via my “virtual” walk (i.e. looking at Google Maps, satellite views and the geotagged photos Google shows; you can search for ‘cruz de ferro’ and see what I’m talking about, I don’t reproduce photos from online sources due to implied copyright) it’s not quite the same as the image of the movie. The site is near a major road and is surrounded by parking lots and picnic areas. The cross itself is unimpressive so only interesting due to its historical perspective. Plus visitors leave a lot of mess at the site so again it’s not so quaint.

Also in the movie a collection of rustic signposts is shown. It turns out that’s just a short distance from the cross in the town of Manjarín (you can search for this to see). It appears to be part of a somewhat bizarre albergue/bar near all those signs, the Manjarín Encomienda Templaria.  That too is a bit less quaint than the movie made it look. So much for fiction.

And this raises an interesting point that I couple with other observations. A “virtual” walk certainly isn’t the same as a real one, but I’ve “seen” enough to get a much better understanding of what the Camino is like. And, frankly, a lot of it isn’t that great. The people who have the spiritual connection to the route don’t care, but for merely a “tourist” who’d like a more physical experience than riding tour buses I now question whether I’d really want to ever walk the Camino.

Or at least the classic (aka French) route. So now I’ve begun to focus on the Camino del Norte route. What is still appealing to me is visiting the northern (Atlantic) coast of Spain, from France to Galicia. The country looks prettier (certainly greener) and I think the food would be better. Since my wife doesn’t want to do the walking, as a compromise we’ll do part tourist stuff (driving, hitting hot spots like Bilbao) and then some more rural touring in the vicinity of the Camino del Norte and thus have some of the same experience.

But that’s in the future.  Now as to the digressions that are bogging me down.

My original idea was that I could merely focus on a mechanical aid to “translate” the written menus without actually learning Spanish. It’s not that I didn’t want to learn Spanish, I just saw that as too difficult. My sister (RIP) disagreed with my idea and said I should learn the language. So as I recently posted I’ve started to do that since I suspect some conversation with camareros  (waiters) would be required.

But I’m not going to fill this blog with many comments about my efforts. Any reader interested in that language has a lot better resources than I can provide. And my personal issues with it are mostly a digression so I don’t want to fill this blog with my adventures. But I’ll mention a bit.

As I previously posted I found what first appeared to be a good resource for learning a bit of conversational Spanish, which I do think I’d need to be able to order in restaurants. So I’m doing the Duolingo online study and have had decent results, thus far (up to about 600 words now, still struggling with verbs, of course). But as useful as Duolingo is I find that I fairly quickly master their “skills” (aka lessons) but then almost as fast forget most of what I learned. Without repeating some of the vocabulary (or having some other way to practice) I forget.

So, naturally, given an entire lifetime of developing software I began to think about building my own drills. I’ve done this before, several times in fact. Basically I’ve built software “flash cards” but with “intelligent” repetition, where I’ve developed some, not so good, algorithms to maximize drill on the vocabulary (or to some degree grammar) on what I’m not getting. Now learning vocabulary and grammar are helpful but speaking, and worse, hearing Spanish is tough. Duolingo helps a bit for hearing, but Spanish is a language my ear/brain simply don’t get. First of all, most Spanish speakers speak really quickly (this, I’ve found from online sources, is well known in comparison to other languages). And even with Duolingo, the full speed recorded sentences that I have to either translate or simply write what I hear, I miss lots of little bits. I have a terrible time hearing the gender or verb tenses which can be critical. I figure I can botch my pronunciation, as well as gender or conjugation, and probably still be understood, but hearing any response is really going to be tough. But the better I know the vocabulary, without a big mental delay to translate in my head, the more likely I can understand the spoken part. Fortunately there are many Spanish language TV channels in my cable subscription, often with good subtitling, so I have some opportunity, beyond Duolingo, to “practice” hearing, which will be more important to me than actually speaking well.

So, of course I started working on my own software to supplement Duolingo. That does have advantages over just using online courses. To write software one really has to understand some of the structure of the language (“teaching” something to a computer is a good way to find out what I do and don’t understand). So, for instance, I just finished, after considerable study and coding, how to do all the conjugations of regular verbs. And I’ve extracted all the vocabulary I’m learning in Duolingo to put into drills as well. So, IOW, I’ve switched from learning about menus to learning the language to writing code to help me learn the language. Hence, the “digressions” that have diverted my time from my original goal.
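To give a flavor of what “teaching” conjugation to a computer looks like, here is a minimal Python sketch. It is not my actual code: it covers only the present indicative of regular verbs, and the names `PERSONS`, `PRESENT_ENDINGS` and `conjugate_present` are invented for this example. The idea is just to strip the -ar/-er/-ir ending and attach the ending for each person.

```python
# Present indicative endings for regular Spanish verbs, in the order:
# yo, tú, él/ella/usted, nosotros, vosotros, ellos/ellas/ustedes.
PERSONS = ["yo", "tú", "él/ella/usted", "nosotros", "vosotros", "ellos/ellas/ustedes"]

PRESENT_ENDINGS = {
    "ar": ["o", "as", "a", "amos", "áis", "an"],
    "er": ["o", "es", "e", "emos", "éis", "en"],
    "ir": ["o", "es", "e", "imos", "ís", "en"],
}


def conjugate_present(infinitive):
    """Conjugate a REGULAR verb in the present tense.

    Irregular and stem-changing verbs (ser, tener, querer, ...) are not
    handled and would come out wrong.
    """
    stem, ending = infinitive[:-2], infinitive[-2:]
    if ending not in PRESENT_ENDINGS:
        raise ValueError(f"not an -ar/-er/-ir infinitive: {infinitive!r}")
    return {
        person: stem + suffix
        for person, suffix in zip(PERSONS, PRESENT_ENDINGS[ending])
    }


for person, form in conjugate_present("hablar").items():
    print(f"{person:22} {form}")
```

Note that usted shares a row with él and ella, which is exactly the point that confused me at first. The regular verbs are the easy part, of course; the irregular ones need a table of exceptions on top of this.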

But I’m beginning to see the light at the end of that tunnel (plus my coding skills were rusty, so doing my menu translation app will now be a bit easier) and maybe I can get back to my original plan and more, hopefully, interesting posts about menus, instead of my experience with learning Spanish or writing programs.

So stay tuned when I get back on track.

 

Quiero hablar más español

It’s been quite a while since my last post. In addition to all the activities of the holidays I have continued, sporadically, to work on my project that is one of the subjects of this blog. So now I can report some progress.

As a reminder I am (slowly) working my way to develop a mobile application to translate restaurant menus in Spain. To accomplish this I am finding many menus from restaurants in Spain (only Spain to avoid Spanish terms from other Spanish-speaking lands). I translate these using machine translation (mostly Google Translate), then look for discrepancies in that translation method and use either online dictionaries or Google searches to make better “guesses” about translation. Often terms on menus are not translated accurately (or at all) by machine translation.

Once I have accumulated enough raw data (a never ending process) I can create a corpus with Spanish terms and the best English translation I can produce with a “confidence” factor (expressed as a probability). Once the corpus is large enough I’ll write code to extract the best food related (and a few other terms) vocabulary with the highest confidence levels of the accuracy of the translation. Once the vocabulary is “complete” (again a never ending process) I can build my application and then test it on all the menus I’ve accumulated. I’ll judge how well I’ve done this by expecting my translation tool to work much better than other machine translations.
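As a rough sketch of that idea (not code I have written yet), the corpus could be little more than a table of Spanish terms, candidate English translations and a confidence for each, with a function that keeps the best candidate when it clears some threshold. Everything here is an assumption for illustration: the function names, the 0.6 threshold and especially the confidence numbers, which are invented.

```python
from collections import defaultdict


def build_corpus(observations):
    """Collect (term, translation, confidence) triples into
    {term: {translation: highest confidence seen}}."""
    corpus = defaultdict(dict)
    for term, translation, confidence in observations:
        term = term.strip().lower()
        previous = corpus[term].get(translation, 0.0)
        corpus[term][translation] = max(previous, confidence)
    return corpus


def best_translations(corpus, min_confidence=0.6):
    """Keep the top candidate per term, but only if it is trusted enough."""
    vocabulary = {}
    for term, candidates in corpus.items():
        translation, confidence = max(candidates.items(), key=lambda kv: kv[1])
        if confidence >= min_confidence:
            vocabulary[term] = (translation, confidence)
    return vocabulary


# Invented confidence values, purely to show the mechanics.
observations = [
    ("propuesta", "proposal", 0.4),
    ("propuesta", "take", 0.7),
    ("vuelco", "rollover", 0.3),
]
print(best_translations(build_corpus(observations)))
```

Terms whose best guess is still weak simply stay out of the vocabulary until more menus (or more research) raise the confidence, which fits the “never ending process” nature of the thing.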

Fine, a useful exercise as someday I hope to actually need to do this while touring Spain, an indefinite “wish” for me. Being able to accurately translate menus, as well as having knowledge of Spain’s cuisine I’d be able to wisely select my choices.

But, my sister, who was quite dedicated to mastering Spanish, albeit focused more on Mexican cuisine, was critical of my approach. Instead of just building an application her strong suggestion was merely that I should just become fluent in Spanish. A fine idea, but one I find very challenging.

Several times in my past I’ve attempted (not very vigorously) to learn Spanish. Since I lived much of my life in California some fluency in Spanish is almost a necessity. I first tried, decades ago, using the best technology then available, i.e. cassette tapes and accompanying text. Ugh. That was a bust. Later as computer tutorials became more common I also tried those, initially using DVDs (as the sound source, later just online voice recordings). These attempts all failed for me.

Why? For one thing I’m not very good at foreign languages. While I studied both French and German in several years of school classes I never got very far with those. My first trip to Germany was a joke at how badly I could either speak or hear. My only real exposure to having to use French was in Québec, during the time when speaking French was a strong “political” issue. I had a bit more success with that partly because everyone, e.g. waiters in restaurants, insisted on French. My stumbling attempts were at least considered a sufficiently sensitive effort that I had some success.

But with Spanish I have a different problem. The sounds of the language are much more alien to my ear – I really can’t hear the words, especially since, it seems to me, native speakers speak very fast and to my ear the words are run together. And, my attempts at speaking were even worse than my attempts to hear and understand. So this has been very discouraging and so I rejected my sister’s urging to just actually learn the language. Additionally I had the joke running through my head that her years of vigorous effort were analyzed by several other people that she had atrocious pronunciation, barely intelligible to a native Spanish speaker. If she couldn’t do it how could I possibly succeed.

BUT, in my effort to translate menus I’ve also found a serious stumbling block. Even with English menus I often need to have some conversation with the server to really understand the menu. And as I translated more and more menus I found this was even more true in Spain. Certainly discussing food with a knowledgeable server adds to the enjoyment of food (another lesson I learned from my sister who was more skilled at cooking than me and through example demonstrated how dining was more pleasant after discussing menu items in some detail).

So I happened to stumble on a new possible learning method. Just happening on an article on the Net about the best apps for “your new smartphone” (naturally timed with the assumption of Christmas gifts) I discovered Duolingo. Previously I’d done the demos with several of the subscription or purchased online tools with little success. But at least: a) Duolingo was free, and, b) it was available for my phone and so I could do the exercises at any time, not just during some study time while on my computer.

So I downloaded the app (both to phone and multiple computers) and committed myself to really giving an earnest effort to learn at least some basic Spanish. Now, as best I know, traveling in Spain in the larger cities, especially those popular with tourists, probably doesn’t require speaking or hearing Spanish. When I visited Portugal I knew zero Portuguese but managed to get by OK (with some help from hotel staff making phone calls for me). And I managed to get by in both Japan and China, although with considerable help from the people I was visiting.

But my interest in visiting Spain is out in the countryside, initially focusing on the Camino de Santiago (the French route). Now I’m looking more at the Del Norte route since that part of Spain is more appealing to me than the dull plodding through country that looks a bit too much like the Great Plains or Central Valley of California. In such areas I would expect that at least some minimal conversational skill would be necessary. My hope would be: a) I could ask Spanish speakers to speak more slowly and thus hear each word, and, b) that my poor pronunciation wouldn’t prevent them from (mostly) understanding me.

So I’ve now worked as hard as I can on Duolingo. I strongly recommend this for anyone following my blog who might have the same need, especially as it is free (gracias to the community who create these lessons). I’ve made it through 12 days and 12 of the lessons. Duolingo requires a LOT of repetition and thus this forces me to work hard enough at estudio that I actually have made some progress.  Even the sentence I used as the title of this post would have been impossible for me prior to Duolingo.

In the first part of each exercise Duolingo introduces one to vocabulary (and without the more academic approach to grammar, i.e. simple conjugation of verbs). Then the exercises move more and more to responding to spoken phrases or sentences by: a) writing what was said in English, and, b) much harder, writing what was said in Spanish. Each exercise gets steadily harder, making it difficult to “guess” and thus requiring actually learning something, especially when one has to actually type the Spanish (from an utterance) and be picky about getting gender and verb conjugation right. The sheer repetition is working for me.

Despite my best progress ever attempting to learn Spanish I still find it difícil to “hear” the utterance spoken at full speed. I often either cannot hear the spaces between words or miss subtle bits (I really have trouble hearing una vs un). But since I must get every drill question right before I can proceed I muddle through. So thus far Duolingo reports I’ve now encountered 308 words (many useless for my purpose, also they count each version of a verb as a separate word). Thus far, as far as verbs go I’m still only in the present tense and with the singular persons (figuring out that usted is third person like él or ella was fun since Duolingo mostly uses the informal second person tú as ‘you’, which often would be rude for me to use in conversation).

While Duolingo focuses on conversation instead of the typical more “academic” language study (all the grammar details, especially conjugations) I’ve done more exploration with other tools (especially spanishdict.com and Wikipedia) to go beyond the Duolingo simple lessons. I’m accumulating some of my own “lessons” to supplement the Duolingo lessons.

Now another challenge for me is that I’ve also learned, in past language learning efforts, that I’m fairly good at immediate duration memory. So while I’m intensely involved I learn to recognize many words. Unfortunately weeks later I’ve forgotten most of those. So, with Duolingo I actually repeat finished exercises to continue repetition which is key.

BUT, repeating everything is time-consuming and not that helpful. The real repetition I need to do is the vocabulary (or sometimes grammar) that I do badly. So now I’m thinking about another bit of programming for my own learning tool.

Once before I built a fairly complex bit of code to extend my English vocabulary. Using something built into Kindle I would mark English words that I either didn’t know at all (like reading more “academic” texts that use more esoteric vocabulary) or that I wasn’t really sure about. Kindle had a drill application that accumulated the words I’d mark as I encountered them in some book. But the Kindle drill, like Duolingo, wasn’t very “smart” about focusing my drill time on the words that gave me the most trouble. So in my own app I developed a scoring system that adjusted my drill to the words I most often missed and also then made sure all but the easiest (for me) words were at least repeated some. I spent a lot of time tuning how that algorithm worked but never was completely satisfied with it.
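To make the idea concrete, here is a small Python sketch of that kind of scoring. It is not the algorithm I actually tuned (I never was happy with that one); it is just one simple way to do it, with made-up names and numbers. Each word gets a weight from a smoothed miss rate, with a floor so that even easy words still come up occasionally, and then words are drawn at random in proportion to their weights.

```python
import random


def drill_weights(stats, floor=0.1):
    """stats maps word -> (times_missed, times_seen).

    The weight is a smoothed miss rate, so a word never seen starts at 0.5,
    and the floor keeps well-known words from vanishing from the drill.
    The 0.1 floor is an arbitrary choice for this sketch.
    """
    weights = {}
    for word, (missed, seen) in stats.items():
        miss_rate = (missed + 1) / (seen + 2)
        weights[word] = max(miss_rate, floor)
    return weights


def pick_words(stats, k, rng=random):
    """Draw k words for the next drill, favoring the ones missed most."""
    weights = drill_weights(stats)
    words = list(weights)
    return rng.choices(words, weights=[weights[w] for w in words], k=k)


# Hypothetical history: (times missed, times seen).
history = {"un": (8, 10), "una": (0, 10), "gato": (0, 98)}
print(drill_weights(history))
print(pick_words(history, k=5))
```

Unlike my old app, this version repeats every word at least a little, including the easiest ones; dropping words below some weight entirely would be a one-line change. The hard part, as I found, is tuning the numbers so the drill feels right.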

So with Duolingo as a model (incomplete for what I need) and all my past efforts at learning languages I soon will begin to build my study app (a fancy version of the classic flashcards, especially for verbs and gender). I can move all my Duolingo vocabulary to that app, plus much of what I’ve accumulated from menu study, plus just grabbing more words not found in either source from either: a) various lists I’ve found of the “most common” Spanish words, or, b) from going through a couple of dictionaries, tourist phrase books and grammar books I’ve purchased for my Kindle.

Eventually I would expect my drill app to be sufficient to potentially get by in parts of Spain where I might not find any English speakers. One thing I have learned from my foreign travel is that travel itself (public transportation, getting directions) often requires speaking to people who don’t know English (say, unlike typical tourist destinations, i.e. city hotels, museums and restaurants).

But all this is just a start. I know, largely from my experience in Québec that “immersion” is the real way to learn a language. To be someplace where there is no English mandates that I at least stumble through some sort of conversation to get what I need. Mi esposa loved her weeks in Oaxaca and wants to go back (which I’ve resisted) so perhaps I’ll give in and make the trip she wants as preparation for Spain (just as Québec can be a shorter preparation trip for going to France).

So, I won’t belabor this point much more in posts since I’ve focused this blog on food in Spain and the Camino. My efforts to learn a language are probably even more boring to my readers. But I will supplement some of my posts purely about food terms with a bit more of the conversational stuff I pick up through this other study.

 

 

A few words from Astorga

As I mentioned in my previous post I lost a long and heavily researched post about something unusual I found in the area around Astorga, or more properly La Maragatería (link is to Spanish language site), a Spanish region located in the central area of the province of León. It seems there is a local meal (multiple courses) that many restaurants promote, the cocido maragato (link is to Spanish language site). While I won’t try to reconstruct my entire post about this I will recover a few things.

cocido maragato is a meal of multiple courses: meat, vegetables/legumes and soup but it has the unusual feature of being eaten in reverse of the normal order. This idea is summed up in this explanation from one of the restaurants serving this meal (original Spanish from the website first, then the Google translation).

El cocido maragato tiene la peculiaridad de comerse al revés.  

Primero las carnes, luego los garbanzos y verduras y por último, la sopa.

Estos tres servicios se denominan, en la zona, “vuelcos”. 

The cooked maragato has the peculiarity of eating upside down.

First the meats, then the chickpeas and vegetables and finally, the soup.

These three services are called “rollovers” in the area.

First, al revés can be translated as ‘upside down’ but it has multiple translations according to its Wiktionary entry: as an adjective, ‘inverted (with respect to something)’ and as an adverb, ‘1) in the opposite direction or order, 2) back first, 3) inside out, 4) the other way around, and, 5) upside down’. revés alone has various definitions: back, wrong side, other side, inside. Now given the numerous explanations of cocido maragato it’s clear the most useful translation, in this context, is ‘in the opposite direction or order’ since that is the key feature of this meal.

Second, my first encounter with cocido maragato was seeing vuelcos on multiple menus, such as: Primer vuelco, Segundo vuelco and Tercer vuelco. These are headings in the menu where one would normally see platos (in typical menu context, ‘courses’). So the translation as ‘rollover’ was definitely mysterious until one understands the cocido maragato context.

Still none of the online dictionaries have caught up to this meaning of vuelcos as they go with the simpler and more literal translations: ‘upset’, ‘spill’ or ‘complete change’. Interestingly Google’s choice of ‘rollover’ doesn’t appear but it is found in a reverse lookup of the English ‘rollover’ in spanishdict.com. So it’s hard to think of a single word translation that would imply the correct meaning in this context, but ‘inversion’ (or clumsy, ‘reversed course’) would probably be closer than ‘rollover’.

And since I can’t recreate the entire post I’ll just cover one other word that I’ve often seen on menus in the context that this restaurant uses:

El Cocido Coscolo no es sino un agradecido heredero del cocido tradicional maragato.

Sobre esa base, hemos introducido algunos cambios que hacen de nuestra propuesta algo diferente.

Su principal valor es la elaboración propia de los ingredientes.

The Cocido Coscolo is nothing but a grateful heir to the traditional cocido maragato.

On that basis, we have introduced some changes that make our proposal somewhat different.

Its main value is the preparation of the ingredients.

This restaurant, Restaurante Coscolo, is explaining how its eponymously named Cocido Coscolo is its version of cocido maragato. The word I’m focusing on is propuesta which I’ve seen on numerous menus and Google has always translated it as ‘proposal’. In other encounters with this word ‘proposal’ made some sense in the context it is used but never seemed quite right, to me, as the translation. Dictionaries go further with these literal translations: ‘offer’ or ‘design’, or ‘nomination’ (in the sense of proposal of a candidate). While ‘nomination’, according to the dictionary, would clearly apply to a person for a job/office, not a particular preparation, it nonetheless fits, along with the idea of ‘offer’ or ‘design’. I’d use the translation ‘take’, as in the phrase “our take on the traditional xxx is yyy”.

But it would be clumsy to have to explain this rather than just use ‘take’ as the translation (which certainly wouldn’t fit other contexts). This will be a challenge for the app I intend to build, and it suggests that one feature of the app has to be the ability to put a single word in the translation, where touching that word brings up a popup dialogue with a longer explanation to provide the context that is too verbose to include directly in the translation. So analyzing this menu and word provides insight into a UI feature I should include.
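Behind such a feature the data could be quite simple. Here is a hypothetical Python sketch (the `MenuTerm` class and its field names are invented for this post, and the real app may look nothing like it): each term carries a short gloss to show inline and an optional longer note for the popup.

```python
from dataclasses import dataclass


@dataclass
class MenuTerm:
    spanish: str
    short: str       # the single word shown inline in the translated menu
    note: str = ""   # longer explanation shown only when the word is touched

    def inline(self):
        # Mark terms that have a note so the UI can hint they are tappable.
        return self.short + ("*" if self.note else "")

    def popup(self):
        # None means there is nothing more to show for this term.
        return self.note or None


propuesta = MenuTerm(
    spanish="propuesta",
    short="take",
    note="Literally 'proposal' or 'offer'; here, the restaurant's own "
         "version of a dish, as in 'our take on the traditional cocido'.",
)
print(propuesta.inline())
print(propuesta.popup())
```

The asterisk is just a stand-in for whatever visual cue the real UI would use (underline, color) to show that a word has more to say.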

I had more about this whole area and other restaurants and this meal but this is all I can reconstruct in a reasonable time. You’ll just have to do your own research if you want more.