The Old Blog Archive, 2005-2009

Archive for the 'tekhnologia' Category

That 1904 Thing

When we were planning TapExpense 2.0, we decided to make Excel® spreadsheet export the No. 1 priority. This was harder than we had thought.

An expense tracker without an export function is pretty much a dead box. That’s why almost every iPhone expense tracker you find in that category provides CSV export. CSV stands for comma-separated values. It’s a very simple, plain-text format. Each line of the file represents a row, and the columns in each row are separated by commas, hence the name.
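To make the format concrete, here is a minimal sketch using Python's standard csv module. The column names and the sample row are made up for illustration and are not TapExpense's actual export layout. Note how a field that itself contains a comma gets wrapped in quotes so that it still reads back as one column.

```python
import csv
import io

# Hypothetical expense rows, for illustration only.
rows = [
    ["Date", "Category", "Amount"],
    ["2009-12-31", "Lunch, with client", "12.50"],
]

# Write the rows as CSV text: one line per row, commas between columns.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()

# Read the text back; the quoted field comes back as a single column.
parsed = list(csv.reader(io.StringIO(csv_text)))
print(csv_text)
```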

Different iPhone expense trackers export in different ways. TapExpense mails the CSV to you as the body text of an email. This requires no installation of third-party software (which some competitors do require). On the other hand, you need to open the email client on your desktop computer, copy-and-paste the body text into a text file, save it as a .CSV file, then open it in your spreadsheet program. It’s a bit of work.

That’s why we decided that the best way is to export directly to an Excel spreadsheet file and mail the file to you. You can then open the file in your spreadsheet program, preview it in Mail.app or Mobile Mail, or, if you use Gmail, view and use the spreadsheet directly. CSV offers no such convenience.

If you’re a Windows iPhone user, that’s the end of the journey. If you’re a Mac user using Microsoft Office for Mac, you might want to read further to understand a curious design in Excel called the 1904 problem. This is important when you try to copy-and-paste the exported spreadsheet into one of your existing workbooks.

A Very Short History

As it turns out, Excel actually has two date systems: the 1900 date system and the 1904 date system. Why there are two systems is purely historical (see this Microsoft article for details). When Excel first came out on the PC, it had to support the date system used by Lotus 1-2-3, then the most popular spreadsheet program (later dethroned by Excel). Excel for Mac was not designed with such compatibility in mind, so the 1904 date system was used instead. A detailed (and more technical, programmer-oriented) explanation can be found in this excellent blog entry by Joel Spolsky.

So How Does It Affect Me?

Excel’s two date systems have an inconvenient consequence: you cannot copy-and-paste dates from a spreadsheet using the 1900 date system into a spreadsheet using the 1904 date system (and vice versa). To make things worse, there is no way to save a 1900-based spreadsheet file as a 1904-based file (and vice versa). There are both legitimate and historical reasons for Excel not to do the conversion for you, so we are not here to blame anyone at Microsoft.

Now here’s the thing: TapExpense exports 1900-based Excel spreadsheet files.

A majority of Excel spreadsheets in the world are created by Excel on Windows, and if you’re a Mac user, chances are that you open a lot of them. To ensure maximum interoperability, we decided to use the 1900 date system.

But! If you want to integrate the data from the exported file into your own workbook created by Excel for Mac, you’ll immediately notice that every date pasted into your workbook is advanced by four years and one day. That’s no good!

Pasting 1900-Based Date Data to 1904-Based Worksheet: The Problem

The Solution

Fortunately, all is not lost. If you are pasting data from a TapExpense-exported worksheet into a workbook created by Excel for Mac, and you find the dates incorrectly advanced, here’s what you need to do:

Step 1. Enter the value 1462 in an empty cell (any place will do). It may appear as 1/2/08, which is fine.

Pasting 1900-Based Date Data to 1904-Based Worksheet, Step 1 & 2

Step 2. Select the cell, then press CMD-C or click on the menu Edit > Copy.

Step 3. Select the dates you want to correct. Then click on the menu Edit > Paste Special.

Pasting 1900-Based Date Data to 1904-Based Worksheet, Step 3.1
Pasting 1900-Based Date Data to 1904-Based Worksheet, Step 3.2

Step 4. A Paste Special dialog box shows up. Click the option Values, then click on the Subtract option.

Pasting 1900-Based Date Data to 1904-Based Worksheet, Step 4 and 5

Step 5. Click OK.

Step 6. You can delete the 1462 cell now.

Pasting 1900-Based Date Data to 1904-Based Worksheet, Step 6

This in essence subtracts 1462 from every pasted date cell. 1462 is the number of days separating the two date systems: the same date’s serial number is 1462 larger in the 1900 system than in the 1904 system. That’s it!

(Some of you might have noticed that 1462 = 366 + 365 + 365 + 365 + 1, which counts 1900 as a leap year, but 1900 is not a leap year! You’re right. That’s an unfortunate bug inherited from Lotus 1-2-3; it’s a fait accompli, and the 1900 date system will stay that way.)
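For the programmers among you, the shift and the fix can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from TapExpense or Excel. It assumes serial numbers of 61 or more (dates from March 1, 1900 onward), so that the phantom February 29, 1900 can be absorbed into the 1900 epoch; the function and constant names are made up for this example.

```python
from datetime import date, timedelta

# Day zero of each system. The 1900 epoch is set to 1899-12-30 so that
# serials from 61 (1900-03-01) onward match Excel, which counts a
# non-existent 1900-02-29. Earlier serials are not handled here.
EPOCH_1900 = date(1899, 12, 30)
EPOCH_1904 = date(1904, 1, 1)
OFFSET_DAYS = 1462


def date_to_serial(day, system):
    """Return the serial number a date has in the given date system."""
    epoch = EPOCH_1900 if system == 1900 else EPOCH_1904
    return (day - epoch).days


def serial_to_date(serial, system):
    """Interpret a serial number under the given date system."""
    epoch = EPOCH_1900 if system == 1900 else EPOCH_1904
    return epoch + timedelta(days=serial)


# A date stored in a 1900-based sheet...
exported = date_to_serial(date(2009, 12, 31), 1900)
# ...read by a 1904-based workbook lands four years and one day later.
misread = serial_to_date(exported, 1904)
# Subtracting 1462 (the Paste Special step above) repairs it.
fixed = serial_to_date(exported - OFFSET_DAYS, 1904)

print(misread, fixed)
```

The same sketch also shows why the helper cell in Step 1 may display as 1/2/08: serial 1462 read in the 1904 system is January 2, 1908.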

The above-mentioned Microsoft article also has step-by-step explanations on how to work with the two different date systems.

What We Have Also Taken Care of

The Excel spreadsheet that TapExpense exports from your income/expense records uses the regional settings on your desktop computer. These regional settings are observed by all major spreadsheet programs (Excel, Numbers®, OpenOffice.org, etc.). So if you use the German format in Germany, amounts will appear as 1.234.567,89 (period as the thousands separator, comma as the decimal point). Dates are also properly formatted (so 12/31/09 in the US format becomes 31.12.2009 with the configuration above).
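As a rough illustration of what the German conventions mean in practice, the following Python sketch formats the same amount and date both ways. It hard-codes the two conventions rather than reading any system locale, and it is not how TapExpense or the spreadsheet programs do it internally; the function name is made up for this example.

```python
from datetime import date


def format_amount_german(amount):
    """Format a number with '.' for thousands and ',' for decimals."""
    us_style = f"{amount:,.2f}"  # US-style grouping, e.g. 1,234,567.89
    # Swap the two separators to get the German convention.
    return us_style.translate(str.maketrans(",.", ".,"))


day = date(2009, 12, 31)
us_date = day.strftime("%m/%d/%y")      # US short date
german_date = day.strftime("%d.%m.%Y")  # German date
german_amount = format_amount_german(1234567.89)

print(us_date, german_date, german_amount)
```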

In fact, if you prefer the CSV format, you’ll notice that TapExpense 2 also does CSV right: dates are correctly formatted, amounts use the right separators, and Excel or Numbers will understand the file as long as your computer uses the same regional settings as your iPhone or iPod Touch. These are some of the details we took care of when we were working on TapExpense.

We’d Like to Hear From You

So in short, if you are a Windows user, or if you work with Windows-created Excel spreadsheets (you can also set your Excel for Mac to use the 1900 date system as the default), there’s nothing you need to do when you copy dates from TapExpense-exported sheets. If you need to copy dates to a 1904-based workbook, just follow the simple steps above, and all will be fine.

That’s pretty much it. Do you have any suggestions on how we can make it better? We have thought about an option (such as “export 1904-based workbook”) in TapExpense, but given the length of the explanation it would require, that does not seem to be a good design. Or does it? We’d like to hear from you.

OpenRadar

A couple of veteran Mac developers have come up with the idea of OpenRadar. Anyone who has ever filed a bug with Apple through their bug reporting system (“Radar”) knows its closed nature. It’s understandable that you don’t want to disclose bugs you reported on unreleased software from Apple, nor do you want to reveal a bug that has to do with your own unreleased software. So Apple has fair grounds for keeping it closed.

On the other hand, there are situations where you want to make your filing public. I often encounter these two: (1) you have a feature enhancement request that is rejected on the grounds that it’s a “dup”, but you don’t know if it’s really so, or you want to seek some support from your fellow developers; (2) you want to openly discuss a bug or a fix that may benefit all.

At any rate this is community spirit at work. It’s amazing to see how Google’s App Engine is used. Will we be seeing a desktop client for that soon, too?

Update: Tim Burks, the person behind the idea, blogs about it in his own words.

Random thoughts, C++0x, transactionable GUI

In the meantime, the problem of structuring a GUI application at a higher order has caught my attention. The glaring problem of multi-core multithreading is trapping us developers. We are bound by the limitations of modern graphics system design; to our amazement, window systems are themselves very complicated beasts. But while our computation and heavy-lifting parts are migrating to the multithreaded world (hey, even network connections are in that area already), GUI code is still confined to either the main thread or the thread that created the component/widget in question. You don’t, can’t, dare not, ought not, may not pass them back and forth between different threads. How quaint.

In other news, C++0x shows its promise with all its new syntactic constructs (many of them condensed forms of popular idioms, esp. template metaprogramming) and its inherent support for modern threading control objects. And talk about lambdas and better type inference. Hey, but a working compiler is still years away. And will we still be using gcc?

(But I guess I’ll still have problems doing GUI stuff with such a strongly-typed language.)

The real challenge, in my opinion, is how to formulate the GUI event and graphics model as transactionable operations. Animation frameworks like CoreAnimation show the way, but the idea doesn’t seem to have been generalized that much yet. GUI programming is in desperate need of new models that can catch up with the already more advanced, more sophisticated ways of (esp. networked) data modeling and manipulation. It’s such a scandal that database operations can be thought of as series of rollbackable transactions but GUI ops cannot.

I’ll be happy to see the death of the decade-old event-loop model and its rebirth as a higher-order construct. That seems to call for declarative UI design, though perhaps we’re still far away from that.

[ANN] InputMethodKit Backporting Component for OS X 10.4 Tiger

InputMethodKit Backporting Component, or IMK-Tiger for short, helps
input method developers backport their IMK-based input method apps to
OS X 10.4 Tiger. It is what we use to backport the latest OpenVanilla,
a popular input method toolset in Taiwan, to 10.4.

I’ve posted more details on Cocoa-dev.

Against To-Do Lists

I have always had a problem with the various methodologies that teach you to organize your own to-do lists. I often wonder what the to-do list of an accomplished artist, architect or designer looks like. I even wonder if they ever come up with to-do lists at all.

Don’t get me wrong. It’s not that I don’t make to-do lists or that I suspect others don’t. As a programmer I can’t value the tools too much. Many professions, software development included, require teamwork and project management. You need all kinds of techniques and measurements to ensure that things get delivered.

But I sometimes really question the philosophy of compartmentalizing your core activity, which is what to-do lists and the more sophisticated (and widely marketed) schools teach you.

The problem is that many activities, creative ones especially, are not the ones you can desire. There are, I believe, if you listen to yourself carefully enough, moments in life when you don’t feel the pressure that you have to do this and that in order to reach the goal. There are moments when you just feel the urge to do something. And there are, sometimes, those very rare moments when you just do something, and only after it is accomplished do you realize you did it. When such moments come, be they in the “feeling the urge” mode or the “post hoc realization” mode, any to-do item becomes self-evident and natural, and there is no “I have to do this” pressure on it.

That’s why I wonder what people’s to-do lists look like. I’ve tried a few personal organizing tools and methodologies, and I was very bad at adopting them. Sometimes I even felt I was adapting myself to them, that is, by definition, modifying my own modus operandi–even though I wasn’t sure what it really was–to fit in their molds.

It turned out that I’m an organized but undisciplined person. That’s such an oxymoron. I’ve tried, twice successfully, to run a period of my life waking up every day at 7 am, jogging for 3000 meters, starting work in the morning, and calling it a day by sunset, to my benefit: I got a lot done during those two periods. There were a lot of to-do lists. But I wasn’t very happy because of that. Later on, I found myself organizing best when I oscillate between paper to-do lists, post-it notes, OmniOutliner, text editors (and I use four: SubEthaEdit, TextMate, TextEdit and vim) and Notes.app on my iPod Touch. Usually I run my “to-do app” on one of those tools for a while, then move on to another in the next period, and the cycle goes on. That’s what I mean by “undisciplined organization”: I’m not bound by an overarching methodology to run my own life. And I am happy with it.

But there are sometimes those periods of life when I was occupied entirely by one project, or when I was so busy that I didn’t even have time to make a to-do list. I was knocked out of doing them. And many times the post hoc realization has been that I was even happier because I didn’t need to be driven by to-do lists.

That’s how I started to wonder if there are differences between “having to do”, “wanting to do” and just “doing”. In any case I have become more skeptical about the promises that organizing methodologies make, because they probably don’t work well for everyone.

Yeah, perhaps I’m an oddball.

Thoughts on Redesigning a Framework

(I haven’t been in the writing lane for a while. In lieu of stuff about life, here I repost an entry from the ObjectiveFlickr blog.)

I haven’t really been taking care of ObjectiveFlickr.framework for a while. For the past few months many things have demanded my attention. In between I’ve attended sfMacIndie Soirée 2008 and been to this year’s WWDC too. How time flies! I want to apologize for my late response on everything regarding the framework.

Lately we’ve seen a fresh influx of discussions on the mailing list. Reading them, I always have this feeling that “it’s time we updated the framework.” There are a few things that ObjectiveFlickr needs to do better. Some of them are the result of operating system and development environment changes. Here they are:

  1. Better and clearer run loop support
  2. Proxy support in OFHTTPRequest
  3. Fixing the delegate implementation–delegate should never be retained
  4. Support for both 10.4 and 10.5 targets
  5. Properties
  6. Linkage against CommonCrypto instead of OpenSSL (libcrypto)
  7. In with NSXMLParser, out with NSXMLDocument
  8. Support for the-device-and-the-OS-that-shalt-not-be-named-until-July-11th

Many of the items actually have to do with OFHTTPRequest and OFPOSTRequest, two nifty (I think) wrappers of Cocoa’s NSURLConnection (for receiving data) and CFNetwork’s CFHTTP stack (for posting data with progress callbacks). I use them all the time in many of my Cocoa projects, but even they feel a bit rusty now.

The removal of the OpenSSL and NSXMLDocument dependencies also has clear reasons (or, reasons-that-shall-not-be-mentioned).

I’m thinking of a new HTTP request class that depends solely on the CFHTTP stack and does not use NSURLConnection, which means that part needs to be redesigned. The existing OFFlickr* class interfaces look fine, but they’re also a bit wordy compared to their Ruby counterparts in ObjectiveFlickr-Ruby.

Should I create a set of new interfaces that break with the past, or should I maintain the interfaces and swap the internals? This is the question that is troubling me now. I appreciate any feedback on those design decisions.

Helveticul

Bought a pin at a museum shop; it says “Helveticul”. The salesperson, who just like many others is bilingual, confirmed my guess: it combines the word Helvetica with the French insulting word (think of “en-” plus the word in question, then verbalize it). In a stroke of genius it becomes a subtler message than Helvetica the Film’s official “I Love Helvetica” and “I Hate Helvetica” pin-pair. In Helvetica.

So I put it on my backpack and enjoy the love-hate relationship with the typeface, à la française.

Some History on the .cin Format, and on Apple’s .cin Support

Eric Rasmussen of Yale Chinese Mac has started a discussion on the .cin support in Apple’s Mac OS X Leopard, an addition to their existing input method framework as an alternative to help users create their own input methods. I was invited to share what I know about the format, so I wrote a long reply to Eric’s post. The length of the follow-up seems to warrant a standalone blog entry, so here it is. I’ll put more links in the text later.

History of the .cin Format

.cin was first introduced by Xcin, an input method framework for X11 developed in the mid 1990s, as a data format for table-based input methods. By table-based I mean input methods that can be implemented, or seen, as a table look-up mechanism. Around 90% of input methods (Chinese and beyond) can be implemented that way. Apple’s .inputplugin also belongs to that category. Almost every mainstream input method framework supports at least one form of user-customizable IME creation. .cin seems to have become one of the standard data formats because it’s simple and many user-generated tables are already in wide circulation.

I have very limited knowledge of Xcin and other frameworks, but in the early days, .cin was intended as a source format, not to be consumed directly by the input method framework (or more precisely, the table-based input method “generator”). Also, back then a .cin could use any encoding recognized by the framework. So phone.cin (renamed to bpmf.cin in OV) was encoded in Big5, pinyin.cin in GB, and so on. When we were developing the “generic” module (first named OVIMXcin, later renamed to OVIMGeneric) to support .cin in OpenVanilla, we made two decisions. First, we no longer require the user to run a compiler/converter to turn the .cin into a binary format, as used to be the case, which means the .cin is consumed by the input method module directly. Second, all .cin files must use UTF-8 encoding. This opened the door to a bigger character set and the famous “♨” input method.

What’s in a .cin?

So what constitutes a valid .cin file? For OpenVanilla, a .cin file consists of three sections:

  1. A header consisting of directives beginning with “%”, like %ename, %selkey, %endkey. Some of them are like meta-data, some of them are controlling directives;
  2. a keyname block between the directives “%keyname begin” and “%keyname end”. This tells the generic input method to map the key typed to a character displayed in the composing stage (mostly to represent radicals in radical-based input methods), and
  3. a chardef block between the directives “%chardef begin” and “%chardef end”. This is the body of the data table. “chardef” is somewhat of an anachronistic misnomer. It used to define the relationship between key sequences and characters (hence the name), but modern implementations like OV and gcin allow phrases in this block.

Different frameworks have implemented the details somewhat differently. OV’s implementation disallows the use of Windows-style CR LF (so only the UNIX-style \n is used, and that’s also what OS X uses), and comment lines (beginning with #) are not allowed in the chardef block.
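To show how little machinery the three sections need, here is a simplified reader sketched in Python. It is not OpenVanilla’s actual parser. It assumes whitespace-separated key/value pairs, \n line endings, and comment lines only outside the two blocks; the function name and the tiny sample table are made up for illustration.

```python
def parse_cin(text):
    """Parse decoded .cin text into (header, keynames, chardefs)."""
    header = {}    # directive name -> argument, e.g. "ename" -> "Demo"
    keynames = {}  # key typed -> character shown while composing
    chardefs = {}  # key sequence -> list of candidate characters/phrases
    block = None   # "keyname", "chardef", or None while in the header

    for raw_line in text.split("\n"):
        line = raw_line.strip()
        if not line:
            continue
        tokens = line.split(None, 1)

        if block is None:
            if line.startswith("#"):
                continue  # comments are only skipped outside the blocks
            if not line.startswith("%"):
                continue  # stray text outside any block is ignored
            name = tokens[0][1:]
            arg = tokens[1] if len(tokens) > 1 else ""
            if name in ("keyname", "chardef") and arg == "begin":
                block = name
            else:
                header[name] = arg
        elif line.split() == ["%" + block, "end"]:
            block = None
        elif len(tokens) == 2:
            key, value = tokens
            if block == "keyname":
                keynames[key] = value
            else:
                # One key sequence may map to several candidates.
                chardefs.setdefault(key, []).append(value)

    return header, keynames, chardefs


# A made-up miniature table, not taken from any real input method.
SAMPLE = "\n".join([
    "# demo table",
    "%ename Demo",
    "%selkey 123456789",
    "%keyname begin",
    "a 日",
    "b 月",
    "%keyname end",
    "%chardef begin",
    "a 日",
    "a 曰",
    "ab 明",
    "%chardef end",
])

header, keynames, chardefs = parse_cin(SAMPLE)
print(header, keynames, chardefs)
```

Keeping a list of candidates per key sequence is what lets one sequence offer several characters or phrases to choose from.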

Although .cin contains enough information for key-character/phrase mapping, many input methods (like 倉頡 Cangjei/“Changjei” or 簡易 Simplex/Jianyi) require finer control. For OpenVanilla, the control is provided in the form of input method preferences (with some mind-boggling names like “force composition when reaching maximum length of radical” or “use space to select the 1st candidate”). Different input methods require different controls (and those are a must; failure to provide them yields barely usable input methods). gcin differs from OV’s implementation in that it allows those control directives to be expressed in the .cin header, with its own directive extensions.

OpenVanilla’s Own Take of .cin

OpenVanilla’s repository of .cin is available at: http://openvanilla.googlecode.com/svn/trunk/Modules/SharedData/

Zonble has written an excellent tutorial (in Chinese) on how to create
your own input method by writing up a .cin, which is kind of standard
text now: http://docs.google.com/View?docid=ah6d8th954vw_201fd5dkx

Technically .cin is really just a set of key-value pairs with its own convention. OV makes heavy use of .cin as a format. Things like reverse radical/pinyin lookup or associated phrases are also done with .cin-based data tables. I see it as a good sign that Apple has adopted a popular (and mostly consistent and cross-framework compatible) data format for Leopard.

Leopard’s Support of .cin

So what about Leopard? As far as I know, dropping a UTF-8-encoded .cin into ~/Library/Input Methods or /Library/Input Methods and then logging in again just works. A new input method, using the name defined in the .cin, shows up in the Input Menu tab of the International preferences panel. I’m not aware of any per-method level control so far (I might be very ignorant on this).

As for limitations, I’m not aware of any either. OV’s own implementation (and many others) is only limited by memory and your patience (loading a .cin with 200,000 entries on a G3 is no small thing; a database-backed design will solve the problem). Leopard’s own take should not differ much. So it should be very flexible and easily customizable.

Cover-Flowize Your Application

People on the cocoa-dev mailing list talked about the cover flow API, which does exist, albeit not in public. If you know how to use IKImageBrowserView, you already know how to cover-flowize your application.

In your Interface Builder project, drop in a custom view and make its class IKImageFlowView. In the data source class (which has the same form as IKImageBrowserDataSource), implement these two required methods:

  • - (NSUInteger)numberOfItemsInImageFlow:(id)aFlowLayer
  • - (id)imageFlow:(id)aFlowLayer itemAtIndex:(int)index

And there you have it:

Cover Flow Study

I showed that at yesterday’s CocoaHeads Taipei meet-up. The sample code is available here.

This is an undocumented class, so caveat programmor. It may change in the next version of OS X or simply be snatched away from under your nose.

OpenVanilla 0.8.0: Now for Leopard

Today we announce the release of OpenVanilla 0.8. It is the third major version of OpenVanilla (0.6, 0.7, 0.8) since 2004. Version 0.8 is available for Mac OS X and Windows. For the impatient, the new release is available at openvanilla.org.

One thing that I’d like to mention is that the OS X version comes in two flavors–one for OS X 10.4 and above, and one solely for OS X 10.5–the big cat that is finally unleashed today, and that’s why we choose today to announce its release.

LeopardVanilla

OpenVanilla 0.8 for Mac OS X Leopard has a redesigned engine under the hood. On the surface it feels just the same as the non-Leopard version, but because Leopard offers us a clearer, easier architecture for developing input methods, we’ll start migrating the entire OpenVanilla framework accordingly. Other than maintenance releases of 0.8.x, the next major version of OpenVanilla on OS X will be for Leopard (and above) only.

The major changes in version 0.8, in short, include:

  • better visual design elements, including a redesigned candidate window and a set of new icons;
  • redesigned web-site and user manual;
  • the latest version of libchewing for Chewing input method, with an expanded phrase database coverage;
  • wildcard support in Array and Generic input methods, and
  • support for 3rd party on-screen keyboardlets.

Once again, for download and other information, just visit openvanilla.org.
