Next virtual trek – my plan didn’t work out

I know this sequence of posts is way off the primary topic of this blog but this will be the last one (on this topic, at least for a while).

When I last left you hanging I described the method I was going to use to acquire an accurate table of distances, fairly closely spaced (e.g. 3-6km) along the Via Podiensis, so I could spend the next year or so on the treadmill piling up miles to then “take” a virtual trek. My plan was to use a couple of GPS tracks I found online to get an accurate distance along the entire trail and then pick intermediate spots for my table and know their distances.

Since the software I have on my PC only covers the USA my only available tool (at least in my initial plan) was Google Maps (I later tried Google Earth, which has more features). I quickly learned two things: 1) the high resolution (4000 waypoints) GPS track was very tedious to enter (all manually) into Google Directions, which has a limit of 10 points along a route, and thus I was getting less than 1km of trail for 5 minutes or so of work, 2) every now and then, but in minor ways, Google didn’t want to generate precisely the same route as I could see on the map where I could display the entire track (but not get any distances).

So I switched to the lower resolution track (only 500 points, visually on the maps it’s a bunch of line segments that don’t precisely follow the road/street/path/trail). But I figured I could find the flaws in that and patch in bits of the high resolution data.

Now in some ways I’m really being OCDish about this. What difference does it make to be highly accurate? Well, consider this: a real walk has to go where the path goes, not in straight lines across country or through someone’s house or yard. And most of the backroads where the Camino goes are not straight super highways but meandering paths. Now if you’ve ever hiked in the real world you know your actual path can be a lot longer than just a compass line on a map. All those zigs and zags add up. The small set of straight line segments would probably be off, in total distance, by hundreds of kilometers. IOW, not much use for accurately converting treadmill miles to a location on the ground in France.
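To see why a coarse track undercounts, here is a minimal sketch in Python. It assumes a spherical Earth and uses a made-up zigzag "trail" (not real Via Podiensis data); the function names `haversine_km` and `track_length_km` are just illustrative. It measures the same path at full resolution and again keeping only every 8th point.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def track_length_km(points):
    """Sum of the straight segments between consecutive track points."""
    return sum(haversine_km(a, b) for a, b in zip(points, points[1:]))


# Hypothetical zigzag path: every other point is nudged north.
zigzag = [(45.04 + (0.002 if i % 2 else 0.0), 3.88 + 0.002 * i)
          for i in range(41)]
coarse = zigzag[::8]  # keep every 8th point, like a low-resolution track

print(f"fine:   {track_length_km(zigzag):.2f} km")
print(f"coarse: {track_length_km(coarse):.2f} km")
```

The coarse version can never come out longer than the fine one, since each straight segment is the shortest way between its endpoints. How much shorter depends entirely on how much the real path wiggles.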

But not to worry, Google knows this and so it actually follows the road between two points on the road. And while it does a bit of rounding in the distance that’s still going to be fairly accurate.

So other than being a tedious process my preliminary results showed, at the cost of more time than I’d hoped, I could get a fairly accurate route.

WRONG!

I was manually entering a set of points, having worked out a record keeping procedure for doing all this, and everything was fine and then, at the next point, probably only 50m from the previous one, with a road showing in map mode and even clearer in satellite photo mode, Google routes this round-about path, about a kilometer that was essentially a giant U-turn to reach that point from the other direction!

Now sometimes, at least here doing geodashing in the midwest, that’s exactly what one has to do. Yes there is a road on the map and yes you can see it in the satellite photos and NO you can’t go that way because there is a gate or a damaged bridge or whatever. But presumably the GPS track I’m using means that the person who recorded the track DID go that way so it’s possible.

After more experimenting I eventually discovered that what I’m seeing is gaps in the Google underlying database, i.e. some abstracted mathematical description of all the possible roads/paths/trails they know. And in that database you can’t get from point A to point B, at least not just going forward.

So after reading manuals and searching online I eventually discovered (I think) there is no way to solve this. Some electronic mapping systems let you manually enter “vias”, i.e. some line segment that connects two bits of road together. That software is letting you use your knowledge (you can go that way) to override their database that can’t allow you to go that way.

But Google isn’t designed for complex routing issues. It’s designed for ordinary users to do simple things and thus doesn’t clutter up its UI with all sorts of advanced features. I encountered this with my standard USA mapping application (now defunct as the company was bought out and their products dropped; I won’t mention the name). That program was for “pros”, people who had complex navigation problems. For a while it was the only car-based solution but gradually the dashboard GPS came out and also, of course, Google Maps on smartphones. Those solutions are generally much easier to use, but they are “dumbed-down” relative to people with complex navigation requirements, which of course is a very tiny fraction of the market that they can afford to ignore.

So after searching for other solutions (there are a few other online mapping systems, but most have even less data than Google) it appears, like my route on the map, I just can’t get there.

As someone so often says, “SAD”.

So that means I have to use the one other data source I have, which has two problems: 1) the distances between the 34 overnight stops are rounded off and add up to about 50km less than the known distance of the route (there are multiple answers for that to be found, but all of them are greater), and, 2) there are just the 34 waypoints, each of which will take me weeks to reach (yes, the trekkers do them in a day, but I couldn’t imagine doing 20 miles / 6 hours on the treadmill in a day).

Plus my purpose in all this is a “virtual” trek. I did learn that Google has lots of detailed data at short distance intervals: restaurants, hotels, gîtes (the French equivalent of albergues) and other points of interest. So I need all that detail to “see” what the trek would look like. It turns out that only doing relatively short daily distances on the treadmill allowed me to follow (where available) the entire streetview (so literally walk into a town and look around). I have lots of experience looking at satellite photos (though mostly in plains and midwest US which doesn’t look much like France, or even Spain) but online satphotos aren’t the high resolution spy photos so often you can’t “see” very much. And looking at the roof of a house or building is much less interesting than looking at it at ground level.

So while I can use the table I did find, just for statistical purposes, I’m going to have to really guess (from zooming in on the GPS track displayed in Google Earth, unless I can figure out how to load KML files into Google Maps) where I am. It’s not going to be pretty and that’s a bummer that may take too much “fun” out of my virtual trek to even bother.

At least one thing, though, is I can take a look at some French restaurants and while I’m not interested in trying to build a translation app for that at least I can see lots of pretty pictures of food (already seen some, first course in France seems to routinely be pâté not cured meats as in Spain).

So with all this discussion out of the way I can get back to my regular topic, menus in Spain, since Santiago has a ton of restaurants, some with online menus I can decode.

Next virtual trek

I mentioned in yesterday’s post that I had completed my virtual trek of the Camino de Santiago. That is, I take mileage I accumulate on my treadmill in the basement and convert it to locations along the Camino. Google Maps and Streetviews then provide a good “look” at the route.

Why do I do this? First, I want to actually learn as much as I can about walking the Camino and my relatively low daily distances on the treadmill are easy to follow on Google Maps, also allowing me to find restaurants and albergues along the Camino and study their photos and menus to learn more about food, or generally something about what walking the Camino would be like. Second, using a treadmill is boring so I need some sort of incentive – knowing I’m just a short distance, along the route of my virtual trek, to a particular POI (Point of Interest) on a map gives me motivation to do a bit more on the treadmill.

So now that I’ve “finished” the Camino what do I do?

Now I put “finished” in quotes because the data I have for the Camino’s route (and thus distances along the route) is somewhat uncertain. I found a Google Earth GPS track of the Camino and used that for a while, but whoever set that up didn’t renew their Google license (for embedded maps in webpages) and it failed. So I found another route. And guess what, they’re not the same.

There’s an old joke that a man who has just one watch “knows” what time it is, but a man with two watches isn’t sure, i.e. different sources of data almost always disagree. Also, until my latest exercise I didn’t try to get distances along the Camino directly from the GPS data but instead from a table I found on the Net. I did enough analysis to confirm that table seemed relatively accurate and so used that data to declare I had “finished” the Camino.

But two new items for me. While I had learned that “Camino” itself is a vague term (there are many routes of the Camino) I didn’t realize that the Camino Frances (the most popular route) doesn’t actually start in Saint-Jean-Pied-de-Port; that’s just the most popular starting point resulting in about an 800km walk. Instead that particular Camino really can start various places in France, but most commonly in Le Puy-en-Velay France (and then that segment goes by the name, Via Podiensis). Adding that segment (and also going past Santiago to Fisterra) turns the walk into a 1000 mile trek, not just the <500 miles of the conventional route.

So now I have an obvious extension to the Camino to use as my new virtual trek, the entire 1000 mile distance which will give me something to do on my treadmill for another year. So that gives me a new project, figure out the distances along the Via Podiensis. Right away (and I’ll describe this in more detail in a followon post) I found several GPS tracks but all of those have some “issues” as to figuring out distances and milestone waypoints. I also found, at a website that does escorted walks, a table of distances between the 34 overnight stops they make. But that route is: a) not exactly the detailed route of the Via Podiensis, and, b) the distances are round numbers whose sum of all the segments is about 80km less than various sources claim is the total distance.

Now people actually walking the Via Podiensis couldn’t care less about all this; they’ll find the route (possibly with some misdirection) and get to their destination. But I need as accurate a route and table of distances as I can create to do my conversion from miles on the treadmill to locations in France. And so that’s what I’m working on now and will report on in a short while.

Fortunately I have plotted about the first 40km and as I’m now only (on cumulative treadmill distances) about 2km past Santiago I can restart my virtual trek for at least a couple of weeks while I figure the rest out from the multiple sources I have (and perhaps even more I might find).

Now how do I do this?

I have a long history with GPS and GPS tracks and I’ll bore you, Dear Reader, with some details (and record them for myself).

I first learned about GPS when I was working at a small startup in Silicon Valley and one of the engineers was recruited to go work at a new startup, Trimble. I’d never heard of this (or GPS) but learned an ex-HPer, named Trimble, had started the company and was recruiting colleagues he’d known at HP (now in the diaspora of former HP employees populating all the other startups). At that time GPS was a military technology and had a hugely expensive system (in nuclear submarines) but Trimble believed this could be re-engineered for a consumer (albeit only professionals) technology. Later, at another company, I used to ride my bike to work and I often noticed people with huge backpacks and an attached 6′ long stick with electronics on top. I didn’t know it at the time but these engineers were testing the early Trimble prototypes.

So fast forward about a decade and when I first moved to Nebraska I was going crazy in the winters (having been spoiled by California) and so just set out driving south, eventually ending up in Big Bend National Park. Driving solo and trying to read a paper map was nearly impossible so I was in the market for a better alternative. A bit of research revealed that GPSr (the ‘r’ is for ‘receiver’) had truly been reduced to consumer (affordable) level and so I bought my first laptop and the DeLorme GPSr and its software. The world of automated navigation was opened to me.

The laptop worked fine in the car (I also had to discover “inverters”, then uncommon, to power the laptop) but was useless for walking. That led me to discover handheld GPSr’s, in particular the early Garmin eTrex models, which I bought at the original Cabela’s (in Sidney, Nebraska) and used for the first time hiking in the Bighorn Mountains in Wyoming, learning an important first lesson: use the GPSr to record the location of your car so you can get back to it.

All this led me to the world of geodashing, one of the various geo-xxx “sports” from the earliest days of consumer GPS, when receivers were still rare and so enthusiasts would find a way to make a game of using a GPS. Over time I learned more about mapping and especially how to use the early satphotos to study a place one might go, where roads shown on the electronic maps (the data was crummy back then) might not really exist. Over the years I got better and better at using these tools, which eventually led me to my first “virtual” trek.

Now raw GPS tracks are usually pretty messy data. For instance, here’s a set of tracks, made over multiple days (since time affects GPS accuracy) of a corner near my house.

or even this set of tracks including the driveway of my house (the red lines are actual paths of the streets as taken from a surveyed map) – note all the scatter in the data, this will come up as an issue in my next post.

Each GPS has various options for recording data and as you can see in this image (I recorded the maximum data) there is a lot of variability. IOW, early on, with my own experiments I came to look at GPS tracks with a bit of skepticism. So tracks I found on the Net I know are not quite right.

So with all this practice and knowledge I set out to create my first virtual trek, the Pacific Crest Trail (which, btw, I did “finish”, as in do the necessary distance on my treadmill). This was years ago and I don’t remember the details but I remember writing my own code to convert the KML (Google Earth) file I’d found into DeLorme “route” info. I quickly learned that DeLorme couldn’t handle the entire PCT as a single “route” so I had to break it in pieces.
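The original conversion code is long gone, so what follows is only a minimal sketch of the general idea in Python, not a reconstruction. It pulls the points out of a KML track (KML stores each point as lon,lat[,alt]) and breaks the list into pieces small enough for a program with a per-route limit. The sample coordinates are made up, and `max_points` is a hypothetical limit, not DeLorme's actual one.

```python
import xml.etree.ElementTree as ET


def kml_track_points(kml_text):
    """Return (lat, lon) tuples from every <coordinates> element in a KML string."""
    root = ET.fromstring(kml_text)
    points = []
    for elem in root.iter():
        # Tags look like "{namespace}coordinates"; compare the local name only.
        if elem.tag.rsplit("}", 1)[-1] == "coordinates" and elem.text:
            for item in elem.text.split():
                lon, lat = item.split(",")[:2]  # KML order is lon,lat[,alt]
                points.append((float(lat), float(lon)))
    return points


def split_route(points, max_points):
    """Break a track into pieces of at most max_points (>= 2) points.

    Each piece starts on the last point of the previous piece, so no
    segment of the trail is lost at the joins.
    """
    step = max_points - 1
    return [points[i:i + max_points] for i in range(0, len(points) - 1, step)]


# Made-up sample data, only to show the shape of the input.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document><Placemark><LineString><coordinates>
    3.8850,45.0430,630 3.8800,45.0400,640 3.8750,45.0380,650
    3.8700,45.0350,660 3.8650,45.0330,670
  </coordinates></LineString></Placemark></Document>
</kml>"""

track = kml_track_points(SAMPLE_KML)
pieces = split_route(track, max_points=3)
print(len(track), "points in", len(pieces), "pieces")
```

Overlapping the pieces by one point matters for distances: if each piece started on a fresh point, the segment between the pieces would silently drop out of the total.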

BUT, the key thing was DeLorme could convert the waypoints (fortunately closely spaced) to distances. Given the PCT doesn’t follow any “roads” the routing within DeLorme itself was useless, but I found a way to get distances from the GPS track and from that I could then convert my cumulative treadmill distances to location. Of course I used Google Earth to “view” the PCT, but: a) at that time Google hadn’t done Streetview yet, and, b) the PCT is a wilderness trail that doesn’t follow any “roads” in the DeLorme database. But DeLorme (the Topo version) was designed for people doing outdoor recreation and thus was happy to have routes that didn’t follow any known paths in its database and still give distances.
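The same conversion can be sketched without any mapping program. This is a minimal Python version, assuming a spherical Earth and straight-line interpolation between neighboring track points. It builds a running total of distance along the track, then finds where a given treadmill mileage falls. The three-point track is made up and the function names are illustrative.

```python
import bisect
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation
KM_PER_MILE = 1.609344


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cumulative_km(points):
    """Running distance from the start of the track to each point."""
    totals = [0.0]
    for a, b in zip(points, points[1:]):
        totals.append(totals[-1] + haversine_km(a, b))
    return totals


def locate(points, totals, treadmill_miles):
    """Approximate (lat, lon) reached after the given cumulative miles."""
    target = treadmill_miles * KM_PER_MILE
    if target <= 0:
        return points[0]
    if target >= totals[-1]:
        return points[-1]  # walked past the end of the track
    i = bisect.bisect_right(totals, target)  # first point beyond the target
    frac = (target - totals[i - 1]) / (totals[i] - totals[i - 1])
    (lat1, lon1), (lat2, lon2) = points[i - 1], points[i]
    return (lat1 + frac * (lat2 - lat1), lon1 + frac * (lon2 - lon1))


# Made-up track running due north, only to exercise the functions.
track = [(45.0, 3.0), (45.1, 3.0), (45.2, 3.0)]
totals = cumulative_km(track)
print(locate(track, totals, treadmill_miles=5.0))
```

The result is only as good as the track: with closely spaced points the straight-line interpolation error is small, with a coarse track it is not.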

So all of this led to where I am now. I hoped to repeat the process but knowing: a) there is a lot more and newer information, mostly from Google, and, b) DeLorme only has detailed maps for the USA. So now I had to find a new way to replicate the process I used for the PCT and apply it to the Via Podiensis.

And I’ll end this post with this, to be continued with the explanation of the process I am discovering (still having to experiment some) for Via Podiensis which eventually means I’ll have what I need: a fairly precise table of distances (at roughly 10km intervals) that actually follows the roads, paths and even off-road trails (not known to Google, but I can guess some). It’s a tedious process but for me, with my weird obsessions, an interesting exercise in itself with the ultimate outcome (still a hope but fairly sure I can do it) to create what I need for another ~750km of virtual trek.

 

Santiago is only the destination

I’ve been busy, mostly doing lots of actual learning of Spanish (instead of my original goal of just translating menus in Spain) but I’ve kept up my exercise, both bicycle and treadmill. I translate my distance on the treadmill to distance along the Camino de Santiago. Since I only do a short mileage per day I can follow the route, in detail, on Google Maps. I have a list of distances along the Camino (presumably correct, but after all I found it on the Net which makes it a bit suspect for accuracy) and so now I can announce that I’ve completed the trip from Saint-Jean-Pied-de-Port to Santiago de Compostela, 494.86 miles, just under 800km (the road sign at the start of the movie The Way showed 800km).

Now a “virtual” trek along the Camino may sound silly, but here’s my point: 1) I’m not in Spain so I can’t actually do the Camino, 2) I need a reason to pound out exercise miles in my basement (with the “hope” that being in shape means I could do a real walk) and converting miles to locations along the Camino provides an incentive, and, 3) if nothing else I can at least see what I might encounter along the way, as poor a substitute as satellite photos, human geotagged photos and Google StreetView might provide. But as Joost says, “a man can dream!”.

So now I’ve “seen” everything along the Camino, or did I? Like most people I thought the Camino was just that 800km from French border to NW Spain. Yes, I learned there are numerous Camino routes. When Spain was under Moorish conquest the route became the Camino del Norte, a more rugged (and frankly more interesting, to me) route. But why is the route just Spain? Sure in theory it’s to reach St. James but there are lots of routes pilgrims can take.

Now just a bit more on my stats. As the movie says “I started my pilgrimage” on 22Nov2017 (rather that’s when I started with my current file of records, I’d actually done 42.2 miles before then). And you might say, awfully slow there old chap. Yep, my average daily distance is a tiny fraction of what a real trek requires. All I can say in defense is that I’ve also done 10435.9 miles on my stationary bike at the same time, a bit more impressive 25.3 miles a day since starting my virtual Camino. IOW, I could have done the Camino about 20 times (or 10 there and back) on my bike in the same time it took me to “walk” it.

But why did I label my post as I did?

It turns out I’ve been reading a fun little Kindle book “The Journey in Between” by Keith Foskett. Of the various stories (movies, documentaries) I’ve seen about the Camino this one was interesting. It’s the personal story of a young Brit who since a young age has just loved walking. The Camino had none of the usual interest to him, just a good walking route. And as I’ve now learned he started another 740 km (various measures of the distance exist) before St. Jean-Pied-de-Port, in Le Puy, France. He has an interesting story of his personal journey so I’ll let you read that for yourself, but I just want to include two tidbits:

relative to the idea that “Santiago is only the destination”

There is no defining event, no sudden enlightenment. I needed to live in the moment, enjoy the journey.

and

The text summarized my journey. My mindset at the beginning was simple: El Camino had a start and an end. Begin at Le Puy en Velay, finish in Santiago, and complete the challenge. But I realized that the answers lay between those points; neither end mattered.

I’ve wondered why I’ve become so fascinated with the Camino and probably until I try to do it I won’t know. But like Keith (Fozzie) I too have liked to walk my entire life. I grew up in Montana where at least at a kid level I could take long walks. In high school I read of Hemingway talking about hiking near Red Lodge Montana and wanted to go (my parents were not so accommodating on my impulses). I climbed Mt. Washington in bad weather once I started college in Boston. I did my first backpacking trip in the middle of a hurricane. I gradually got better equipment and more skill and have tested myself against the Sierra, the Cascades and the Rockies.

Backpacking or even just wilderness hiking is way different than the Camino. But both emphasize self-sufficiency and rising to cope with whatever comes your way. The Camino (or other long walks in Europe) are oriented to frequent stops in towns and lots of encounters with people whereas the long walks in the USA (Pacific Crest Trail, Appalachian Trail – I’ve done segments of each) are more remote, with fewer creature comforts. Albergues in Spain (or Gîtes in France as I’ve just learned from Fozzie’s book) may be a bit rough but it’s not quite the same as really sleeping on the ground.

But what is it about walking? Sure, lots of people do the Camino for religious or spiritual reasons, the original reason. But today most do it for some other purpose. It’s not as crazy as the mobs on Everest now with a few dying due to overcrowding, but somehow we humans like to get out and move around, and push ourselves into more difficult efforts than we thought we could do. But again, why?

I think the real thing about walking, even the short hikes I do on a couple of local trails, is just that our sense of time and space and, most important, of ourselves changes from the life we normally lead. A good walk may take 6 hours in a distance a car moves in 15 minutes, but how different is the experience. Humans evolved to react to our environment at the pace of walking, not cars or planes, or even bicycles. Somehow the rhythmic thing of one foot in front of another changes us.

And time changes. The constant hurry of our normal world is now replaced by a loss of sense of time. Time is measured by when scenery changes, when someone else is on the path, approaching in the distance, getting larger and larger, then saying hello, and then gone, all in more time than the typical business meeting. Time is when I reach the bend I can see ahead. And time is a lot, hours of walking, more of a single thing than we normally do. And mostly solitary. Even if walking with companions talking is only some of the time. We spend more time with just ourselves than we do in any other event, except perhaps sleeping.

And then somehow physical exertion, being very aware of our bodies (especially aches and pains), the slow passing of time, the building of fatigue (or hunger or thirst or needing to pee) just become our focus. The other stuff falls away.

So the most meaningless part of my “virtual” Camino is not disconnecting with normal life and connecting with life on the road. I’ve known this well enough, in my multi-day backpacks and bikepacks, to understand what it means. And somehow it is compelling.

The Camino, for me, is not as enticing as it was before I did my “virtual” version of it. I’ve looked at enough of the path (often a gravel path right next to a highway with lots of traffic and no shade; I have a place nearby, called the Cowboy Trail, that can provide that) to reduce the glamour. I’ve read enough accounts, books and online diaries, to see some of the bad, or just the mundane, along with the good. My illusions are less, my enthusiasm is less.

But the wanderlust is still there. One point of the Camino, for someone who does just want to take a long walk, is all the accommodations for pilgrims. Being able to stop at night, find food along the way, etc. fits my age better than my backpacking days. When I did my first bike camping trip, along the California coast, I quickly saw some advantages over backpacking. I had to stay at campgrounds (not just on any piece of ground) and those are near towns. So forget lugging food. Unpack the gear from the bike, set up the tent, and head to town, not just for heavy (none of the freeze dried nearly inedible stuff) food but also a nice bottle of wine, unthinkable to carry on a multiday backpack. So the idea of carrying even less, as in trekking on the Camino, sounds pretty good. Sleeping in a bad bunk bed in a dormitory, not so much.

So I still haven’t found my dream (and also, at this point in my life, “bucket list”) walk, but I’ll keep looking. The people who do the Camino have a letdown when they’re done, often finding an excuse to go further (or perhaps reverse course and go back where they started). Because the destination is not a place, it’s a state of mind, and it’s not a time, it’s forever. The geodashing I do has a slogan “getting there is all the fun”. Anyone on the Camino would understand this.

estoy de vuelta de Oklahoma

I had some family business in Oklahoma and so planned some other sightseeing for about a week. But things didn’t work out. I read that this has been the most rain over twelve months ever recorded for the USA. Certainly I can personally attest to that around here. So what is normally a hot and dry and dusty area was a swamp. Two different Interstate highways were closed. But I did get to my farm and check out the grass (all it grows) and the wind turbine (which was really turning, but we didn’t get cancer or see any “carnage” of dead birds so I don’t know which wind turbine Trumpidot visited to see such things but in the real world it’s all lies). But that aside.

What does any of this mean for the primary topic of this blog? Well first I also visited my birth town of Amarillo Texas. Never when I lived there did I know: a) amarillo is Spanish for yellow, and, b) it’s not pronounced ama-rell-o. And the original state was really Tejas and later anglicized to Texas. And we visited Palo Duro Canyon which was just a name to me, as in ‘hard stick’ as Google translates and Wikipedia confirms but I see no connection with that place. But the point is a great many names for things in Texas are from Spanish, which of course makes sense as Texas was part of Spain, then Mexico, longer than it has been part of the USA, something most Texans choose to ignore. So I grew up surrounded by Spanish but was hardly aware of it. Later I spent most of my professional life in California. By that time I knew the first town I lived in, Palo Alto, and the next, Los Altos, certainly were Spanish names and that the main street of both, El Camino Real, was Spanish. It was a long time after living there that I learned Los Gatos was named for the mountain lions. So in some ways it’s remarkable to me that it’s taken me 71 years to really attempt to learn Spanish.

I planned to keep up my language study on Duolingo on the trip but I’d started a process of recording exactly what drills I did, first in an Excel spreadsheet and then in an app I wrote. Both were working fine when I transferred these over to the laptop we use exclusively for travel. Of course, when I needed them both failed. I had the source code of my app, but my development environment was expired and I’d lost all the passwords, then my Office 365, while not expired, demanded a login and I didn’t have that password. So with no way to record my study in Spanish I essentially stopped doing it. I did a few drills in French and German just to keep up my “streak”, but basically I lost about 6 days.

And I’m amazed, then, at how much I forgot. Certainly I remembered 98% but several words I’d used in drills just before leaving on the trip I could no longer recall.

I’ve been grappling with this for a while. Duolingo does a lot of repetition, especially if you do every drill in every exercise (instead of testing out) but then there isn’t a lot of repetition of previous material. I had already developed an app to counter this, something I could use daily to refresh my memory, but since it was just vocabulary drill (glorified flashcards) that wasn’t enough. So I wanted to repeat entire drills (usually 20 individual questions) to also deal with grammar, word order, gender and verb conjugation. But a lot of repetition of previously learned material cuts into learning new material. For me it’s not so much a question of time, which I have in sufficient quantity, as merely endurance, i.e. I can only take so much language study each day.

And in developing my app to manage what Duolingo material I’d do I also did a simulation, and my first results indicated it would take nearly two years to complete the full Duolingo “tree” (their entire course, which puts you somewhere in the A2 (CEFR) range). With the initial algorithm I had for also including repetition it shot out to more than three years to finish. Given I’d like to visit some Spanish speaking country sooner than that it means: a) I have to carefully ration repetition, just enough to retain what I’ve learned, and, b) actually increase my daily effort. To stay on schedule, i.e. aim at the big picture each day, requires more than casual attention so my app, with all the data recording, statistical analysis and prediction, is necessary, at least for someone like me, to make the right daily progress toward a long term goal.
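The squeeze between repetition and new material shows up even in back-of-envelope arithmetic. The sketch below is not the app’s actual algorithm, just a minimal Python model. It assumes a fixed number of drills per day, a fixed fraction of them spent on review, and a made-up course size, so every number in it is hypothetical.

```python
import math


def days_to_finish(total_new_drills, drills_per_day, review_fraction):
    """Days needed when a fixed share of each day's drills is review.

    review_fraction is the share of daily drills that repeat old material
    (0.0 = all new, must be below 1.0). All inputs are hypothetical.
    """
    new_per_day = drills_per_day * (1.0 - review_fraction)
    return math.ceil(total_new_drills / new_per_day)


# Hypothetical course size and daily endurance, purely for illustration.
TOTAL_DRILLS = 1000
PER_DAY = 10

for fraction in (0.0, 0.25, 0.5):
    days = days_to_finish(TOTAL_DRILLS, PER_DAY, fraction)
    print(f"review {fraction:.0%}: {days} days")
```

In this simple model, spending half of each day on review doubles the time to finish, which is why the repetition has to be rationed or the daily total increased.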

So I’ll leave you with a trail photo. While this is up in the mountains of Wyoming we saw a lot of this on this trip, including some flooding over the roads too much to cross at all.

 

 

Beef by any name is ???

One of the fun things about trying to study menus in Spain is figuring out the correct terms for ‘beef’. Here in the USA, and especially in Nebraska, the second largest beef producing state in the USA (surprise, Texas is first, obviously, but what about Montana or Colorado?), it’s just beef (and if beef, as in a steak, is not explicitly stated it can be safely assumed).

Now the cuts of beef (or any meat) are yet another subject. Most menus include ‘beef’, but what do they call it? It’s almost always “grilled” (various names for that) either on a hot iron cooking surface or over coals on a grate. IOW, it’s some kind of steak and as best I can tell, from looking at photos and reading descriptions, it’s more or less the generic “steak” (almost certainly beef in the USA). It’s hard to tell from the menu whether you’d get an old tough piece of cow (most likely) or something a little better. Of course in beef crazy parts of the USA there are lots of terms as well.

But is beef just beef, so that it doesn’t much matter, i.e. red meat cooked fairly rare? Now Spain certainly has an ample supply of lamb (lots of names for that) or pork (uncured, fairly simple, i.e. cerdo, and cured, well, lots of names for that). If you’re not avoiding red meat you’re fairly safe getting almost anything that is “grilled” (mistakenly often called barbecued in the USA, which is rarely the case, since real BBQ is something entirely different, both the meat itself and the method of cooking).

The most common term (from my non statistically significant analysis) is ternera, which most dictionaries would call ‘veal’. But this is not really veal as we’d think of it, especially relevant to Italian style veal preparations. In Spain this seems to just be, mostly, a young cow, not the anemic milk-fed very young calf you might think of as veal.

Now as an outsider (and not as a butcher or rancher) I believe ternera is just a young cow, not much different from feedlot beef in the USA. Any USA producer of beef faces the issue that at some point you’re spending more money to keep a cow alive than that cow is gaining in commercial meat, so most feedlot beef is actually fast growing young cows. It is more gourmet (and much more expensive) to have more mature, larger cows, especially “free range” (I’ve sometimes seen terms that imply this in Spain) or even more expensive “grass fed”. So my guess is that ternera in most restaurants is not much different than the generic “beef” one would find in the USA.

Now terms for beef in Spanish are also complicated because some of the countries in Western Hemisphere, esp. Argentina, are big beef producing (and consuming) countries and so you may encounter terms for beef, in dictionaries or web searches, that would rarely apply in Spain. But here are a few I’ve managed to collect:

carne vacuna: beef
Ternera de leche: veal
Añojo or ternera: 1-2 years old
Novillo: 2-4 years old
Buey: castrated male over 4 years old
Vaca: female over 4 years old
Toro: uncastrated male over 4 years old

Now vaca is somewhat common (in my sample of menus in Spain) and is, by dictionary lookup, just ‘cow’, i.e. again beef. Buey is less common, but as per the definitions above that’s because it’s from an older animal and thus probably even more expensive, even though it’s also probably tougher (to a degree tender and tasty are conflicting terms when it comes to beef).

The other term one finds, not in the list above, is de res which seems difficult to define and also is less commonly used in Spain.

But one amusing difference between Spain and the USA is that rather old cows seem to be an especial treat (when done properly). Apparently Spain imports older cattle and fattens them up. When you see photos of the raw cut of meat the fat is thick and very yellow compared to the usual whiter fat. I suppose I could be sold on this as an interesting meal, but it doesn’t sound likely. So while chuleton is common (for the older cows) you also encounter what may be a very specialized term, txuleton (the Basque equivalent and likely even less common except in northern Spain).

Now as to eating toro I’ll leave that to others. I suppose Spain has to do something with all those bulls killed in the ring but I can’t imagine this would be a top-notch culinary experience.

So back to ternera – why is that so common? I’ve seen two explanations: 1) younger cows are butchered to reduce the chance of having mad cow disease, plausible but the term itself is older than the concern over mad cow disease, and, 2) that raising cattle to older age isn’t very compatible with the agriculture in Spain, either as “free range” and/or “grass fed” which is an expensive (and land intensive) way to get good beef, so really the economics and process of raising cattle in Spain, somewhat like feedlots in USA, encourages early “harvest” of the animal to human food.

While a simple grilled steak may be a “safe” choice at a Spanish restaurant I wouldn’t expect that to be a very desirable selection. The roast lamb almost certainly seems more delectable.

Probably by any name (and cooking technique) the various terms for beef will put something on your plate you can eat as a good protein source (assuming you can even stand red meat; avoid any of these terms if you don’t like meat) and maybe sometimes it will be a tasty choice. Coming from beef country in the USA (originally Texas, now Nebraska, #2 in beef and famous for its steakhouses) I imagine I’d always find this edible (and some “beef” I had in Germany was dubiously edible). So it’s probably hard to tell from the menu alone the quality of the beef you’ll be eating.


Last 100km + some menu translations

It’s been a while since I’ve made any posts related to the primary purpose of this blog, which is analyzing menus in Spain in order to construct a translation application.  So now I’ll do a quick return to that kind of post.

In order to explore restaurants in Spain (and as an incentive to keep churning out miles on my treadmill in the basement) I’m converting exercise miles into locations along the Camino de Santiago. Today I’ve reached the very last place you can start a trek and still qualify for a Compostela (you need at least 100km), which looks to me to imply starting at Portomarín, at least along the route of the Camino Frances. That’s where I just arrived after 436.1 miles of virtual trek. Actually I think this remaining distance is probably some of the better real trekking, even if it is only a few days.
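The miles-to-location conversion boils down to a unit conversion plus a table lookup. Here is a minimal sketch in Python, assuming a table of cumulative distances along the route. The table below is entirely made up (placeholder names and distances, not real Camino data) and the function name is hypothetical; only the miles-to-km factor is a real constant.

```python
import bisect

KM_PER_MILE = 1.609344  # exact definition of the international mile

# Hypothetical table: (cumulative km from the start, place name).
# These values are placeholders for illustration, not real trail data.
WAYPOINTS = [
    (0.0, "Start"),
    (4.2, "Village A"),
    (9.8, "Village B"),
    (15.1, "Village C"),
]

def last_place_reached(miles, waypoints=WAYPOINTS):
    """Return (place, km) for the last waypoint at or before the distance walked."""
    km = miles * KM_PER_MILE
    distances = [d for d, _ in waypoints]
    # bisect_right gives the count of waypoints with distance <= km.
    i = bisect.bisect_right(distances, km) - 1
    return waypoints[max(i, 0)][1], km

place, km = last_place_reached(3.0)
print(f"{km:.1f} km walked, last waypoint passed: {place}")
```

With a real table, the same lookup would turn the 436.1 treadmill miles into roughly 701.8 km along the route.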

And there, in this relatively small town, I also found a good restaurant to consider for understanding menus and then relating a couple of points to you, Dear Reader. Since I have to honor copyright and not put other people’s pictures in my posts, I strongly suggest you go to maps.google.com and use this search: “O Mirador, Portomarín, Spain”. Not to be plugging this restaurant, but there are over a thousand photos accessible through the Google Maps site, including lots of pictures of zamburiñas, which Google Translate doesn’t understand despite these being very common and popular in Galicia, and close relatives of the scallop that is an icon of the entire Camino pilgrimage.

Now the main way I study menus is to extract them into some working documents I created and then get the Google translation. Generally GT does fairly well but it also misses or botches some terms. That then sends me into my research, using various dictionaries and food sites and just plain old searches, to get clues for a better (as needed) translation of the menu items. So for instance, Google Translate doesn’t know zamburiñas, but a Google search easily finds the term and even references a Wikipedia article for ‘variegated scallop’. First in my search results is an article in Spanish, Diferencias entre vieiras y zamburiñas, which is quite helpful.

When I started this project over a year ago I actually knew no Spanish. I ignored advice to actually learn Spanish since I was convinced I could succeed without doing that. But as I admitted in earlier posts I realized the advice was right and so I’ve been plowing through learning the language. In fact, I could mostly translate this key sentence (from the article above): Las zamburiñas son de unas dimensiones más reducidas comparado con las vieiras. That of course doesn’t mean much unless you know (in addition to the other words) that vieira is the conventional term for ‘scallop’, that is, the typical standard size (and the source of the shells on all the peregrinos’ packs and on the trail signs). So in case you can’t read the sentence (even though it’s got a lot of cognates to English) it just means that zamburiñas are smaller than vieiras. What that doesn’t tell you is that these are quite popular (and widely available) in Galicia, and the photos connected with O Mirador make that clear (and they look persuasively delicious as well).

Now let’s consider the restaurant’s name. One of the menu items, Parrillada O Mirador, which Google translates as ‘Grill O Lookout’, shows the typical highly literal translation GT does, without paying any attention to context: O Mirador is the name of the restaurant, and parrillada is derived from a term you see more frequently, parrilla, which is one of several terms that get loosely translated as ‘grilled’ (usually with a la preceding it). Contrast this with a la plancha, which is also usually translated as ‘grilled’: a plancha is usually a flat iron surface (i.e. the flattop grill in many restaurants), while a parrilla is an actual grate over a wood or charcoal fire and thus what most of us home cooks would consider “grilled”.

Fine, but what about mirador being translated as ‘lookout’? This is why I want you to do the Google search and see the photos. spanishdict.com translates mirador as either ‘enclosed balcony’ or ‘lookout’ and it turns out, from the photos, that both apply equally. This restaurant is at the top of a hill overlooking the river and adjacent valley, but it also has a wraparound enclosed balcony for diners. Looks like a fun place.

I had planned on covering some more interesting bits from the menu but I’m out of time (other duties call) and so I close with the promise that I’ll get back to writing about menus (yeah, sure).

Still plugging along

Despite a lack of posts recently I’m still around and plugging along on my virtual trek. I seem to have injured my left toes so I had to back off on the intensity of workouts. To burn roughly the same amount of calories I have to go a longer distance, so actually my pace has picked up a bit. I’ve reached 427.0 miles and seem to be near the tiny village of Peruscallo, heading to Morgade. The Camino since Ponferrada seems to have nicer facilities and certainly has nicer scenery. The comparison here would be going east to where there is more precipitation: instead of looking like western Nebraska, this part of Galicia looks a lot like northern Missouri (the natural environment, that is, since the human part looks nothing like anything around here).

So in keeping with the thread of this post I’ll add a few more trail pictures of an area that is radically different than anything you’d find on the Camino.

First up, here’s the trail (this one I’ve actually walked):

This is the Santa Elena Canyon in Big Bend National Park in Texas. You can see a few people on the trail headed into the canyon. The trail only goes a relatively short distance before a dead end but is a spectacular hike. Often you can also see many canoes on the river, which just happens to be the Rio Grande. IOW, the left side of the picture is Mexico. If the insane and ugly wall ever got built they couldn’t put it in the middle of the river so instead this trail would be lost forever (or have a gate in the wall so tourists can visit but then what’s the point of a wall with a hole in it).

So here’s an image of where the cars go (you could hike that road but I wouldn’t advise it).

Same river and you’re looking north, the USA side. Now try to imagine where you’d put a wall there. And no one would ever get to enjoy this spectacular sight-seeing drive in Texas.

Now the previous two pictures are along the river where there is a lot of greenery. But just a bit further north (and in this case also west) this is more what this area looks like:

I never really cared for or appreciated deserts until I visited the Big Bend area but it can be quite spectacular. At this time of year there are few flowers but on my first visit it had been an unusually wet winter and the wildflowers were overwhelming and gorgeous. Many places you can just walk out in the desert (outside the US National Park and the Texas State Park it’s all private land and not advisable to walk as locals don’t care for strangers and everyone has guns). But you have to be really careful and watch your step, first to avoid damaging the quite fragile growing things, but also, since almost every growing thing has thorns to avoid damaging yourself!

BTW: In case you’re wondering about my learning Spanish and studying menus in Spain, yes, I’m still doing it. In fact I’ve reached level 22 in Spanish at Duolingo.

Glossary Updated

This post describes a recent process to update the glossary found on this blog. I believe a reader should know how a glossary is assembled in order to know how much to trust its accuracy, so I’m trying to be as transparent about the process as possible. Furthermore my glossary has two “biases”: 1) it is aimed at terms found in Spain, not any Spanish term from anywhere, and, 2) I (mostly) only include terms I’ve actually found on the hundreds of menus from restaurants in Spain I’ve collected and analyzed to create a highly curated corpus. So while considerable effort went into constructing the glossary, naturally it still has errors as it was manually compiled. But I believe it is one of the better and more exhaustive glossaries you’ll find, at least for free on the Net.

After eight more days of work since my post about this effort I decided to call it “done” and update my glossary page as version 4.0. The glossary gained about 150 items, had numerous errors corrected (especially spelling, and especially accents), had some definitions changed or enhanced, and adopted my “syntax” to show all the forms of a word under a single “lemma” (I just learned this term from linguistics).

Despite all the work I did there are still mistakes, omissions, inconsistencies in the lemma representations and other errors. This is the challenge of manually editing a large amount of material, even while trying to be very careful. Each time I do this manually I learn a bit more about how I’ll have to create the software to create and manage a properly curated corpus which I’ll need for my translation application.

Not every term in this glossary is really a “translation” to English as often there is no translation. So instead, based on terms I have found in the many menus from Spain restaurants that I’ve analyzed as the “raw” data, I have sometimes had to supply a description instead of either a “definition” or a translation. For instance, I researched and added most of the names of grapes used in Spanish wines, olives used in tapas and cheeses used in various dishes. While one might translate Cabrales as “blue cheese” this isn’t that helpful so descriptions work better.

So almost every term in my glossary I have found in menus. There are more terms in the various glossaries I’ve found and assembled but unless I actually see a term used in a menu in Spain I can’t be certain some term from some other glossary actually applies to Spain. Or, of course, Spanish food terms in other parts of the world may mean something entirely different than they do in Spain and so I’m trying (as best I can) to focus on the vocabulary one would encounter in Spain.

I may do some more “fixes” or additions to this glossary but I don’t expect to do another major revision. As it is this is now one of the largest glossaries you’ll find anywhere on the net (and perhaps the easiest to access, just a single, albeit long, webpage, not some more complex access scheme). So while this glossary, like anything you find on the Net, is easily available, one should ALWAYS be somewhat skeptical as the editor is human and makes mistakes, so check with authoritative sources for any terms that might really matter for you.

A look at my drill application

Since I’ve mentioned this in multiple posts I thought I’d provide a little more detail. Here’s a screen shot with some food terms.

Ugh, WordPress is hard to get images right, hope this looks OK after saving. Good, for some reason the image looks bad in WordPress’ post editor but I chopped the screenshot to fit and it looks OK after posting.

BTW: Spanish readers out there will note kokotxa in this list which is really Basque, not Castilian which would be cococha.

Anyway, the basic idea is to load a random (though biased to get most effective drilling) set of words and then I visually examine them. Most drills do some sort of “quiz” but this is for me so I just scan the list.

If I don’t instantly know the translation I click the word. That gives me a score of -1 (otherwise if I don’t click a word it gets a score of 0, for appearing but “known”). I don’t “cheat”, since this is just for me, so I don’t need a quiz.

But if I have the least bit of doubt I click and then I see the translation. Then I decide: a) if clicking was a mistake, I click the Ignore button; b) if I thought I knew the answer but was wrong, I click the Wrong button and my score becomes -3; and c) if I really didn’t know at all (or my “guess” was wildly wrong) I click the “no clue” button and get a score of -10.
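The scoring rules above fit in a small lookup table. This is only a sketch in Python (the post doesn’t say what language the drill program is written in), and the names are hypothetical. One assumption is marked in the code: the post doesn’t state what score Ignore produces, so the sketch treats it as undoing the click.

```python
# Penalty values come from the post; the dictionary keys are hypothetical names.
SCORES = {
    "known": 0,      # word appeared in the list and was never clicked
    "clicked": -1,   # clicked to reveal the translation, no further verdict
    "ignore": 0,     # ASSUMPTION: Ignore cancels the click (not stated in the post)
    "wrong": -3,     # thought I knew it, but was wrong
    "no_clue": -10,  # didn't know at all, or the guess was wildly off
}

def score_word(clicked, verdict=None):
    """Return the score for one appearance of a word in a drill round."""
    if not clicked:
        return SCORES["known"]
    if verdict is None:
        return SCORES["clicked"]
    return SCORES[verdict]

print(score_word(False))            # word scanned and known
print(score_word(True, "wrong"))    # clicked, then marked Wrong
```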

After I’ve looked at all the words I click Done to record the results. Then I click Drill to get a new set of words (which is more likely to repeat words with scores other than 0). I continue as long as I can stand and then click Save (unless I’m just testing code) and the scores are then added to the XML database.
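A “random but biased” selection like this is commonly done with weighted sampling. The sketch below is one possible approach, not necessarily what the drill program does: each word gets a weight of 1 plus the size of its accumulated penalty, so clean words still show up but troublesome ones show up more. The function name and the weighting formula are assumptions.

```python
import random

def pick_drill_words(totals, count, rng=random):
    """Pick up to `count` distinct words, favoring words with worse totals.

    totals maps word -> accumulated score (0 or negative).
    """
    words = list(totals)
    # Weight 1 for a clean record, growing with the accumulated penalty.
    weights = [1 + abs(min(totals[w], 0)) for w in words]
    chosen = []
    while words and len(chosen) < count:
        # Draw one index by weight, then remove it so no word repeats.
        i = rng.choices(range(len(words)), weights=weights)[0]
        chosen.append(words.pop(i))
        weights.pop(i)
    return chosen

totals = {"carne": 0, "kokotxa": -10, "ternera": -3, "vaca": 0}
print(pick_drill_words(totals, 2))
```

Here kokotxa (weight 11) is far more likely to be drawn than carne (weight 1), but nothing is ever excluded outright.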

And if I’m sure I want to record the results then I can use the File menu item to save a new copy of the XML. The XML Editor and XML Update are what I use to fix issues in the database itself.

All the drill results are saved in another part of the XML (eventually making it very large, hurrah for having lots of RAM to have all this in memory – I come from the days when RAM was scarce and had to do lots of programming tricks, now I just brute force all this).

Then I have an analysis routine (WIP) to consolidate all the scores over all the drill sessions to find out which words are worst (lots of mistakes, therefore drill more) and which are best (few or no mistakes, so only drill after some time has passed).
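As a rough illustration of that kind of consolidation (the real routine is a work in progress and its details aren’t given), here is a Python sketch that sums scores per word across sessions and lists the worst words first. The session format, a list of (word, score) pairs per round, is an assumption.

```python
from collections import defaultdict

def consolidate(sessions):
    """Sum scores per word over all sessions; return the worst words first.

    sessions: list of drill rounds, each a list of (word, score) pairs.
    """
    totals = defaultdict(lambda: {"seen": 0, "penalty": 0})
    for session in sessions:
        for word, score in session:
            totals[word]["seen"] += 1
            totals[word]["penalty"] += score
    # Most negative accumulated penalty sorts first.
    return sorted(totals.items(), key=lambda kv: kv[1]["penalty"])

sessions = [
    [("vaca", 0), ("buey", -3)],
    [("buey", -10), ("vaca", -1)],
]
for word, stats in consolidate(sessions):
    print(word, stats)
```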

While I intend to create other types of drills this is “good enough” to have me looking at a fair portion of my vocabulary every day (todos los días) and thus keep refreshing my wetware memory. I can’t do this very long (so the magenta number on the screen shot is a timer of how long I’ve been doing drills, rarely do I exceed 20 minutes) because I’ll start having “short-term” memory (since my mistakes are more likely to repeat in the drill, by design) and so I begin to “know” them, but not really.

I’m focusing the drill (really the way I’ve created the XML database) on recognizing the Spanish, since, again, my goal is reading menus, not writing them. So my database is (now) poorly structured for doing English drills, which is harder than the Spanish drills, but more useful if I need to be able to ask questions about the menus.

And of course this is all “written” rather than spoken drills and to be really helpful I actually need to know how hablar a camarero but I’m getting there.

Back to menus; a big project

My primary purpose for this blog is to record my progress in developing an application to translate menus in Spain. I worked diligently on this for about nine months but then got into some side-trips in other projects. But now I’m trying to get back to that primary objective.

For 78 days now I’ve also been trying to actually learn Spanish via the nice online application, Duolingo. While this diverted me from my primary task it has been useful. My sister always thought my idea was silly and that instead I should just learn the language. That’s not a bad idea but it looked harder (and more time consuming) than my primary limited work just to read menus, based on the assumption I’d soon be heading to Spain to tour along the route of the Camino de Santiago. Therefore I needed results sooner than I could learn the language.

To build my application I’d first need a large corpus of terms from menus with accurate English equivalents. To do that I’d import the text from websites into a working document and crunch through all the terms. Often that gave me some interesting observations that I was converting to posts, hopefully also interesting to my readers. Obviously there are going to be mistakes in manually collating data so my corpus needed to be carefully curated, with the terms and my “guesses” at translation with a “confidence” factor. Then via the large corpus I could extract the accurate equivalent Spanish to English translations I’d need for the application.

That’s a long slog so a couple of times I went ahead and created a minimally curated “glossary” which I have as a page here at this site. In my searches I found a number of glossaries, or even dictionaries in Spanish, covering food. Years ago when I first got interested in these I just extracted all the glossaries I could find and manually collated them into a single glossary. It was a mess!

The trouble is that food terms in Spanish (my searches) yield results that either don’t apply to Spain’s food dialect or were just wrong. After all any other person who compiles glossaries makes mistakes too. Or I’d make mistakes extracting and collating them. And my lack of any fluency in Spanish meant I often misinterpreted the raw material I was attempting to organize. That previous experience convinced me I needed to be very precise about collating material AND focused on Spain as the source of the raw material and so my idea about creating a corpus evolved.

But in nearly a year I still don’t have that corpus. And without it I can’t build my application. And in the meantime I needed to get some “drill” code done since I reached the point where I was forgetting more than I was learning. And while Duolingo is fairly good for learning Spanish it’s not as good for repeating previous lessons (and their vocabulary). And repetition is the key to learning a language. So I found myself forgetting vocabulary I’d once before acquired.

So I set out to build a drill application, which has some of the same elements I’d need in the translation application. And like compiling glossaries I’ve done this also, in the past – the first time for Italian food terms. So I’ve built drill programs before with only limited success.

The key to a drill program is to be efficient and force me to do repetitions of the vocabulary I know the least well. That’s harder than it sounds. Plus most of the types of drill I did (glorified flashcards, a common language learning technique) took so much time that as my vocabulary grew my repetition, of any particular word, got less and less frequent. Even with an hour a day I could only repeat a fraction of the vocabulary I’d acquired.

So I had some ideas how to improve this and make the drill more efficient. But I needed data even to do the programming. So I fairly quickly assembled the glossary I posted at this blog without being too concerned about its accuracy.

So with that lengthy background now I can describe what I’ve more recently done and the “big project” I’m now doing. I built my first version of the drill application centered around the Duolingo vocabulary. As I’d do each lesson I would fairly carefully assemble the “database” (a complex XML) to feed the drill program. For my Duo vocabulary that now contains about 1100 “terms” and 1400 “forms” of those terms. By forms I mean the usual four spellings of adjectives (in Spanish both gender and number) and the first set of conjugations for verbs. Getting all that going for Duo vocabulary drills got me a fairly useful and efficient drill program which is helpful as a supplement to Duolingo.
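To make “term” versus “forms” concrete, here is a tiny Python sketch that expands a regular adjective ending in -o into its four spellings. It deliberately covers only that regular pattern (adjectives like grande or azul behave differently), and the function name is hypothetical.

```python
def adjective_forms(masc_singular):
    """Return [masc. sing., fem. sing., masc. pl., fem. pl.] for a regular -o adjective."""
    if not masc_singular.endswith("o"):
        # Other adjective patterns are out of scope for this sketch.
        raise ValueError("only regular adjectives ending in -o are handled")
    stem = masc_singular[:-1]
    return [stem + "o", stem + "a", stem + "os", stem + "as"]

print(adjective_forms("asado"))
```

All four spellings would then sit under the one term, which is why the count of forms runs ahead of the count of terms.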

So then using that code and crunching the glossary I’d assembled here I started on the food terms. And that was a bit of a mess because the glossary sucked.

So to fix this I went back to my 30 or so working documents of all the menus I’d processed. Rather than the more difficult chore of extracting material for a well curated corpus I just quickly (in a couple of days) extracted all the accumulated Spanish. That’s a tedious chore but it does reveal some of the problems of getting “raw” material from the websites. Naturally I found lots of spelling mistakes (easier for me to recognize now that I know a little Spanish) but also inconsistencies in gender and sometimes number. Many words are also used very inconsistently when it comes to accents. My Duolingo study also let me learn the rule that accents sometimes change (for real, not typos) in certain circumstances.

So once I’d compiled all my “words” from all menus I had about 10,000 “raw” bits that I was able to clean up, de-duplicate and consolidate (like all the forms of adjectives under a single “term”) and ended up with about 5500 lines.
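Part of that cleanup can be assisted by code. The Python sketch below groups raw spellings that differ only by case, stray whitespace, or a missing acute accent, so a human can review each group. It is an illustration, not the process actually used. It keeps ñ intact since that is a separate letter, and it only flags variants; it does not decide which spelling is right (and since accents sometimes legitimately differ, every group still needs a human look).

```python
import unicodedata

ACUTE = "\u0301"      # combining acute accent (á, é, í, ó, ú)
DIAERESIS = "\u0308"  # combining diaeresis (ü)

def fold(word):
    """Lowercase and drop acute/diaeresis marks, but keep the tilde of ñ."""
    decomposed = unicodedata.normalize("NFD", word.strip().lower())
    kept = "".join(ch for ch in decomposed if ch not in (ACUTE, DIAERESIS))
    return unicodedata.normalize("NFC", kept)

def group_variants(raw_words):
    """Map each folded key to the set of raw spellings that share it."""
    groups = {}
    for w in raw_words:
        w = w.strip()
        if w:
            groups.setdefault(fold(w), set()).add(w)
    return groups

raw = ["chuletón", "Chuleton", "chuleton ", "zamburiñas"]
for key, variants in sorted(group_variants(raw).items()):
    print(key, sorted(variants))
```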

Then in a separate process I took the latest (v3.3) copy of my glossary and then combined that with about six other glossaries. That was a chore and resulted in about 4000 entries.

So then I combined these, all the glossary “words” and all the menu “words”, and started going through all that by hand. I’m now done with everything through M (since I sorted all 9000 or so lines into alphabetic order). I’ve done a few hundred “fixes” to my glossary and about 100 additions. But more importantly all those changes are in my XML “database” for the drill program. With a bit of code I can then extract from that XML to create text I can paste into the glossary page here.
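The “bit of code” for that extraction might look something like the Python sketch below. The XML layout here (term, form and english elements) is invented for illustration, since the actual database schema isn’t described in the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema; the real XML database is described only as "complex".
SAMPLE = """<vocabulary>
  <term lemma="ternera"><form>ternera</form><english>young beef</english></term>
  <term lemma="asado"><form>asado</form><form>asada</form><english>roasted</english></term>
</vocabulary>"""

def glossary_lines(xml_text):
    """Return one 'forms: english' line per term, sorted alphabetically."""
    root = ET.fromstring(xml_text)
    lines = []
    for term in root.iter("term"):
        forms = [f.text for f in term.findall("form")]
        english = term.findtext("english", default="?")
        lines.append(f"{', '.join(forms)}: {english}")
    return sorted(lines, key=str.casefold)

print("\n".join(glossary_lines(SAMPLE)))
```

Keeping the XML as the single source and generating the page text from it means a fix only has to be made in one place.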

So when I’m finally done with all that tedious manual work I can update my glossary and it will be a big change so I’ll make that the v4.0 version which I believe will be quite a bit better than my current v3.3 but not as good as a curated corpus needs to be. And, really my glossary will then mostly contain words that exist in reference sources (several online dictionaries I use) and/or reconciliation with the other glossaries I found.

Please note, therefore, that my work product is derived from many sources plus my own editorial work and thus constitutes “original” work. I’m quite conscious of never (almost never) posting anything in this blog that would violate copyright, i.e. the wholesale use of someone else’s glossary.

And now all my material is synchronized – my XML database for the drill program, my derived glossary with reconciliation to other glossaries or reference sources, and I’m only including terms in either place that I’ve found in menus so my product is more closely aligned with Spain dialect and I can exclude other Spanish food terms.

Now, while that isn’t done, I’m back into the code for my drill program. In the case of my Duolingo vocabulary I feed into the drill program I (mostly) know that vocabulary by memory. Duolingo is divided into lessons (aka skills) that require 40 actual drills (to pass the skill and unlock the next one) which means about 800 individual drills. At Duolingo I’ve now done 16,843 “XPs” over 31 skills. On average each skill introduces around 30 words (forms actually). So when I do my “refresh my memory” drills with that vocabulary I have relatively few words I ever mark as uncertain, or worse, “I’m wrong” or “I’m clueless” (really forgot). That means all the scoring I’ve done with that vocabulary has relatively few “errors” and my aggregate score on most terms is 100%.

In contrast I’m much worse on my new food vocabulary. As I’d work on menus I’d “learn” many words, but I had almost no repetition of those (only the most common words appear on many menus, so that was my only repetition) and I’d done none of my own drills. Now that I have something to feed my drill program I’m getting a lot more “bad” scores. That’s good and bad. It’s bad because it means I don’t know those words very well, by memory. It’s good because now all the scoring of the drills I record in the XML has a lot more data than the drills on Duolingo vocabulary.

So that means back to programming. How do I consolidate tens of thousands of individual drills into some sort of metric that rates each word in the vocabulary as to how well I know it (and/or don’t confuse it with similar terms)? I want to drill myself on what I know the least. I don’t very much need to drill on carne or agua or cerveza or a few hundred other food words and I don’t want to waste the limited time I have for drills (even less than my free time because drill is tedious and I can only tolerate a certain amount each day). So those are the algorithms I’m now trying to develop so my drill program is even more efficient and therefore more useful.
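One candidate for such a metric, offered purely as a sketch since the algorithm is still being worked out, is a percentage: 100% when a word has never been clicked, 0% when every appearance ended in “no clue”. The formula and the function name below are assumptions; only the -10 worst-case penalty comes from the drill’s scoring.

```python
WORST = 10  # magnitude of the "no clue" penalty used by the drill

def knowledge_pct(scores):
    """Rate one word from its list of per-appearance scores (0, -1, -3 or -10)."""
    if not scores:
        return None  # never drilled, so nothing to rate yet
    lost = sum(min(abs(s), WORST) for s in scores)
    return round(100 * (1 - lost / (WORST * len(scores))), 1)

print(knowledge_pct([0, 0, 0]))        # a word like carne
print(knowledge_pct([0, -3, -1, 0]))   # occasionally shaky
print(knowledge_pct([-10, -10]))       # completely forgotten
```

Sorting the vocabulary by this number, lowest first, gives a drill order; it ignores how recently each word was seen, which a fuller version would probably want to factor in.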

So while I thought I’d be done with this by now I have probably another week to finish cleaning up my food vocabulary and enhancing my drill program. But once I’m done with that I can spend 15-30 minutes every day (or most days) getting more of the food vocabulary into longer-term memory along with a growing Duolingo vocabulary. Thus I’d hope to have reasonable fluency within a few months, so soon I may need to head to some Spanish speaking country to test myself.

Now, note, all this is “reading” (and less “writing”) Spanish. Hearing or speaking is an entirely different problem. But without mastery over much of the vocabulary actual conversation is pretty hopeless. I’d originally assumed I’d have no more audible Spanish than a few phrases and the rest I’d do through reading (plenty of time to study a menu, have to be fast to have conversation).

Now, finally, all this I’m just doing for myself, other than relating some hopefully “interesting” tidbits here in the blog. I’ve built many software products over my working life, but this one is just for me. But at least, as a derivative from this work, I do hope to end up with the best glossary for food terms in Spain here at this blog as my contribution to others who might need this.