Write anything I want to write...

Computing

Final Fight Against OOXML

Microsoft just lost the vote on OOXML! However, the war is not over yet. (For details, see the news coverage at http://www.noooxml.org/) ISO has decided to move forward to a Ballot Resolution Meeting from February 25-29, 2008.

We just DON'T NEED OOXML!

Say NO to OOXML

A hot topic in the software community these days is OOXML [en.wikipedia.org]. What is OOXML? Simply put, it is a file format specification developed by Microsoft and currently used by MS Office 2007. Instead of giving full support to the existing OpenDocument [en.wikipedia.org] ISO standard (700+ pages for OpenDocument 1.1), Microsoft proposes to make its OOXML (a 6000+ page proposal) an ISO standard.

So what is the problem?

Countless. The standard proposal is not open: It was not created by a group of interested parties but by Microsoft alone. Since there is already the OpenDocument standard (which is already supported by OpenOffice.org, a free multiplatform and multilingual office suite), a dual standard adds costs and confusion. More importantly, OOXML is technically flawed: It has internal inconsistencies, uses non-standard conventions (which conflict with some existing ISO standards), has some inflexible formats, has ill-formatted XML examples, has errors in spreadsheet formula specifications (for example, it forbids any date before year 1900, also a bug found in MS Excel), etc.

So, we just DON'T NEED OOXML! You can find more information about objecting to OOXML at NoOOXML. Currently, the format is undergoing a standardization process within the ISO. There are politics and dirty tricks in the voting process. For example, it was suspected that Microsoft bought the Swedish vote, and Microsoft admitted it! (See this [www.os2world.com], this [www.computerworld.com], this [www.groklaw.net], this [news.zdnet.co.uk].) As a result, the Swedish vote has been declared void.

What can you do? Spread the word, sign a petition here [www.noooxml.org], and make the world a better place!

Wonderful Talks about Free Software and Copyright

I have listened to the recordings of two fantastic talks about free software and copyright by two respected authorities—Richard Stallman and Eben Moglen. I would highly recommend the talks to those who are interested in knowing more about free software, GPLv3, the evolution of copyright, etc.

Copyright vs Community in the Age of Computer Networks by Richard Stallman
Hosted by University of Waterloo Computer Science Club
Audio (OGG) [www.csclub.uwaterloo.ca].

The Global Software Industry in Transformation: After GPLv3 by Eben Moglen
Hosted by The Scottish Society of Computers and Law in Edinburgh, Scotland
Text [www.archive.org]; Audio (MP3) [www.archive.org]; Video [www.archive.org].

My Computers

1. My Home Desktop

1.1. Hardware

Box A

Refer to this on my upgrade for this box.

Box B
Box C

(Obsolete)

1.2. Software

Operating System   | Ubuntu Linux 8.10 Intrepid Ibex | Microsoft Windows XP
Usage              | More than 99.99% | Less than 0.01% (Exclusive for gaming only)
Main Uses          | Surf websites; Check e-mails; Write programs; Watch movies; etc. | Play games only for Windows (and don't run on Wine or VirtualBox)
Firewall           | Protected by NAT behind a modem router | (same)
Anti-Virus         | (Not required) | AVG Anti-Virus Free
Defragmentation    | (Not required) | Windows Disk Defragmenter
Web Browser        | Firefox | IE
Office Suite       | Open Office; Gnumeric | (Not used here)
Image Manipulation | GIMP | (Not used here)

1.3. Various Configurations

~/.bash_aliases
alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'

# ?? Extra
#export EDITOR=vi
export PS1='[\u@\h \A \w]\$ '

export HISTTIMEFORMAT="%Y-%m-%d %H:%M "
export HISTFILESIZE=100000
export HISTSIZE=100000

export PATH=$PATH:./

alias emacs='emacs -geometry 80x38+0+0'

~/.mplayer/config
# Display
subcp=utf-8
#font=/usr/share/fonts/truetype/arphic/uming.ttc
utf8=yes
subfont-autoscale=2
subfont-osd-scale=2
subfont-text-scale=3
subfont-outline=5
osdlevel=2

# Miscellaneous
vf=eq2
sub-fuzziness=1
idx=1

Fix the latex2html Black Line and PNG Transparency Bugs

The latex2html program is a wonderful and popular package for converting LaTeX to HTML. However, it suffers from a black line and a PNG transparency bug even with the latest version 2002-2-1 (1.71). You should be able to reproduce the bugs and use the workaround below in Ubuntu 6.10.

The Black Line and PNG Transparency Bugs

Try to create the following bugs.tex and then use latex2html to convert it:

\documentclass{article}
\usepackage{amsmath}
\begin{document}
\begin{align}
\cos^2\theta+\sin^2\theta &= 1, \\
\cosh^2\omega-\sinh^2\omega &= 1.
\end{align}
\end{document}

Point your browser to bugs/index.html, and you may see something like this:

[Image: An illustration of the latex2html black line and PNG transparency bugs]

Look at the ugly black borders and the grey background. The black lines appear in some images regardless of the image type (either GIF or PNG) you use. (Note: You can select the image type by passing the -image_type gif or -image_type png option.) They seem to appear only when I use the amsmath package, so the programming bug probably resides in amsmath.perl. Besides, some PNG images don't have a transparent background.

The Workaround

If you use the GIF format, you can refer to the solutions (here and here) by Clay S. Didier. To fix PNG images, you can use my workaround below, which is partly based on Clay S. Didier's idea. I personally recommend sticking to the PNG format because it is technically superior for such equations, and does not suffer from any patent problem like GIF does. (See the League for Programming Freedom GIF page for details.)

The workaround fixes the problems by post-processing the problematic PNG images generated by latex2html. Create the following eqnfix.sh script for fixing the images. Note that the script depends on some PNM utilities. In Ubuntu, they can be installed from the official netpbm package.

#!/bin/sh
# Fix the latex2html black line and PNG transparency bugs
#
# After latex2html has generated the PNG images, run this script
# for each directory that holds the problematic PNG images.
#
# Changes:
#   11-Jan-07: Included a fix for PNG transparency and some small improvements
#   15-Sep-05: Created this script

# Get the directory path; print the usage and fail if it is missing
if [ -z "$1" ]; then
  cat <<EOF
Fix the latex2html black line and PNG transparency bugs.
Usage: $0 <directory_with_problematic_PNG_images>
EOF
  exit 1
fi

# Fix each PNG image: crop the black border, then make the grey
# background (#B3B3B3) transparent
for file in "$1"/*.png
do
  [ -f "$file" ] || continue   # skip if the pattern matched nothing
  echo "Fixing $file..."
  pngtopnm "$file" | \
    pnmcrop -black | \
    pnmtopng -transparent "#B3B3B3" > img_fixed.png
  mv -f img_fixed.png "$file"
done

Simply invoke the script for each directory that holds the problematic PNG images. For the example above:

$ latex2html bugs.tex
...omitted...
$ eqnfix.sh bugs
Fixing bugs/img1.png...
pnmtopng: 16 colors found
Fixing bugs/img2.png...
pnmtopng: 11 colors found
Fixing bugs/img3.png...
pnmtopng: 17 colors found
Fixing bugs/img4.png...
pnmtopng: 9 colors found

After running the script, you should see the following in your browser now:

[Image: After fixing the problematic PNG images generated by latex2html]

It's obviously what you want, right?

Remarks

The fix above post-processes the PNG images and is thus not a really proper solution. The fix can fail in some rare situations if you play around with the colors of the font and the background. The best solution is to fix the source code of latex2html directly. It can be tedious to fix the bug, and I'm also not an expert in the latex2html source or Perl. Please share if you have a better solution. Your feedback is welcome.

Solving MySQL Problems on Christmas Day

Disaster

On the morning of Christmas 2006, I visited this site and encountered a 'disaster'... the Drupal engine complained that it failed to connect to the MySQL database server! Did somebody attack my site? How should I repair it?

I logged in via SSH remotely, and found that the MySQL server had been shut down abnormally. I restarted it and the site was alive again as usual! But when I wanted to check the Drupal logs, the page didn't load... no, I waited and found that it could load but took more than 10 minutes! What happened?

Diagnosis and Repair

I suspected that there were some problems in the Drupal database. First I checked the number of entries in each table with: (Masked off some critical things here. You think I will show you my password huh?)

$ mysqlshow -uxxxx -pyyyy --count drupal

And I got it: the watchdog table contained more than 1 million rows! I used mysqlbinlog to find out all recent queries. Well, Drupal had logged more than 1 million entries of PHP errors concerning the failure to open some ancestor directory that does not exist, involving something like opendir('drutex/../../../../../../../..') with the '../' part repeated many times. To restore the database performance, I did

mysql> DELETE FROM watchdog WHERE message LIKE 'opendir%';

and I had only about 1,000 rows left in the watchdog table. I surfed the logs page again, but it still took more than 10 minutes! Something was still wrong, and I suspected that it was still related to the database. So I modified the function _db_query() in the Drupal file includes/database.mysqli.inc to record all queries and the time taken, and added something in index.php to store the results:

if ($_SERVER['REMOTE_ADDR'] == 'ip.of.my.computer') {
  // Dump the recorded queries and their timings for later inspection
  file_put_contents('/somepath/result.txt', print_r($queries, true));
}

Surfed the logs page again, and examined the output file. There were 3 queries where each took about 280 seconds, and all of them involved the watchdog table! To confirm the problem, I performed a simple SELECT COUNT(*) FROM watchdog; and it took about 280 seconds. So the table must have been corrupted somehow.

I did a CHECK TABLE watchdog MEDIUM. After 15 minutes, MySQL told me that the table was basically OK. I suspect that it was not, and that I would have got a different result if I had used the EXTENDED option to make a full scan. In fact the problem could be so subtle that it escaped the scan with the MEDIUM option. But I did not want to waste the time and so I just asked MySQL to repair it for me:

mysql> REPAIR TABLE watchdog EXTENDED;

The EXTENDED option rebuilds the table more or less from scratch, and is not generally used. But I believed that the problem was severe, and I was desperate too! There was no harm in using the option, but it was expected to take a long time to repair. Luckily, it only took about 6 minutes. I tried

mysql> SELECT COUNT(*) FROM watchdog;

again and I got the result in less than a second!

Wrap Up

What had happened here? I haven't delved into it much (yet). But even if it was an attack, it was not severe as no data was lost. Besides, I make regular backups of the site. For security, I upgraded all software involved to their latest versions, including MySQL, drutex, etc. The binary log of MySQL and the access log of the Apache server recording the disaster are still on the server. So if I want, I can do some tedious forensic work here, and use it against the cracker in court if there is one... Dare you crack me!

Build Your Own Ephemerides

1. Introduction

This is a rough guide on how to build or implement your own ephemeris program or library routines. There are some existing programs (and some of them are free) to do such a job. However, there are a few advantages if you build your own.

  • It's your thing. You have complete control of the licensing and the copyright of the program. You can also improve the program however you like.
  • Good way to learn Spherical Astronomy. Of course, you need to like mathematics and astronomy. If you have no interest in these subjects at all, it's better to use an existing program or library routines for your purpose.
  • Extend it for many other purposes. Since you are familiar with your program, you can easily extend it for many purposes such as (a) calculating the phase of the Moon; (b) predicting eclipses; (c) building your birth charts for many different traditions of astrology; (d) creating calendars (such as the Chinese calendar) that are based on astronomical events; (e) many other things.

While these sound interesting, you have to note that building your very own ephemeris is not for the faint of heart! If you are not familiar with spherical astronomy, be ready to learn a lot of concepts and technical terms related to the subject. As for implementation, expect to write

  • a few hundred lines just for a very basic set of features and low-precision calculations;
  • a few thousand lines for more features and satisfactory accuracy;
  • a few tens of thousands of lines for more complete features and high-precision calculations.

There are some easier ways to tackle this problem. For example, you can interpolate the positions of the planets from some existing ephemerides (such as DE405, DE406). Such an approach is definitely feasible and realistic. But here we focus more on performing the calculations "from scratch" which is useful in many cases. E.g., to calculate just the rough rising and setting time of the sun for a few centuries, it's better to stick to a simpler and less space-consuming low-precision calculation rather than to make use of the gigantic ephemerides.

Let's start.

2. Knowledge Required

Before starting your implementation, there are many concepts you need to understand and study. Here's a list of some important ones.

Mathematical Software Library Routines. In particular, you need to deal with a lot of trigonometric and logarithmic calculations. Be very familiar with these to avoid making stupid mistakes. For example, a trigonometric function generally expects an angle expressed in radians, so don't use an angle expressed in degrees directly.
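
The degrees-versus-radians pitfall is easy to demonstrate. The short Python sketch below is only an illustration (Python is my choice for the examples here, and the helper name sin_deg is made up): it calls the library sine once with a raw degree value and once after converting to radians.

```python
import math

def sin_deg(angle_deg):
    """Sine of an angle given in degrees (hypothetical helper)."""
    return math.sin(math.radians(angle_deg))

# math.sin() expects radians, so this treats 30 as 30 radians.
wrong = math.sin(30.0)

# Converting first gives the sine of 30 degrees.
right = sin_deg(30.0)

print(f"math.sin(30.0) = {wrong:.6f}   (30 radians, not what we meant)")
print(f"sin_deg(30.0)  = {right:.6f}   (30 degrees)")
```

Wrapping the conversion in small helpers like this, and using them everywhere, is one way to keep degree-based formulas from a book from silently producing nonsense.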

Concepts in Spherical Astronomy. You need to know a lot of things specific to this subject. Knowing some popular concepts in astronomy like the big bang or dark energy won't help you here, as we are dealing with positions! Below is a list of some common concepts just to give you an idea of what to expect. Note that the list is NOT exhaustive! The higher the precision you want to achieve, the more of these concepts you need to understand, and the more deeply.

  • Time Tracking. TT/TDT (Terrestrial Time); ET (Ephemeris Time); ST (Sidereal Time); UTC (Coordinated Universal Time); UT (Universal Time); Delta-T; Julian day; Julian ephemeris day; Gregorian calendar; etc.
  • Coordinate Systems. Horizontal coordinates (altitude, azimuth); Equatorial coordinates (right ascension, declination, equinox, solstice, hour-angle, transit, meridian); Ecliptic coordinates (longitude, latitude, obliquity); Galactic coordinates; Heliocentric; Geocentric; Topocentric; etc.
  • Reference Frames. ICRF (International Celestial Reference Frame); FK5 (Fifth Fundamental Catalogue); epoch; mean equator and equinox; etc.
  • Orbital Elements. Perihelion; Aphelion; Perigee; Apogee; Eccentricity; Kepler's equation; Mean longitude; Mean anomaly; True anomaly; Equation of the center; etc.
  • Miscellaneous. Astronomical Unit; Radius Vector; Aberration; Precession; Nutation; Parallax; Refraction; Twilight; Conjunction; Opposition; True/Geometric position; Mean position; Apparent position; etc.
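
Two items from the list above, the Julian day and Kepler's equation, are small enough to sketch in code. The Python below is only an illustration under stated assumptions: the function names are made up, julian_day assumes a Gregorian calendar date and uses the widely published integer-part formula (the same form appears in [JM]), and solve_kepler uses plain Newton iteration for an elliptical orbit (0 <= e < 1), which is one common choice among several.

```python
import math

def julian_day(year, month, day):
    """Julian day for a Gregorian calendar date.

    `day` may carry a fraction, e.g. 1.5 means 12:00 on the 1st.
    Assumes the Gregorian calendar (dates after 1582-10-15).
    """
    if month <= 2:            # treat Jan/Feb as months 13/14 of the previous year
        year -= 1
        month += 12
    a = math.floor(year / 100)
    b = 2 - a + math.floor(a / 4)   # Gregorian calendar correction
    return (math.floor(365.25 * (year + 4716))
            + math.floor(30.6001 * (month + 1))
            + day + b - 1524.5)

def solve_kepler(mean_anomaly, eccentricity, tol=1e-12, max_iter=50):
    """Solve Kepler's equation E - e*sin(E) = M for E (all in radians)."""
    e = eccentricity
    # Start from M for mild eccentricities, from pi for very elongated orbits.
    E = mean_anomaly if e < 0.8 else math.pi
    for _ in range(max_iter):
        delta = (E - e * math.sin(E) - mean_anomaly) / (1.0 - e * math.cos(E))
        E -= delta
        if abs(delta) < tol:
            break
    return E

jd_j2000 = julian_day(2000, 1, 1.5)    # the standard epoch J2000.0
E = solve_kepler(1.0, 0.1)             # M = 1 rad, e = 0.1 (arbitrary test values)
print(jd_j2000)
print(E, E - 0.1 * math.sin(E))        # second value should reproduce M
```

The date 2000 January 1.5 is a handy self-check because its Julian day, 2451545.0, is the J2000.0 epoch used throughout the literature.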

Mathematics. To understand some basic calculations, you need only high-school mathematics. Be ready to deal with a lot of trigonometric functions and analytical geometry. But to deal with very high precision calculations, you may need to know more on calculus and astrophysics.

Where can you learn these concepts? Look for some good books or Internet resources related to this subject. I myself have learned a lot from [PDS] and various Internet resources. To learn spherical astronomy seriously, you can start with [WMS].

3. Guides for the Algorithms

3.1. Notes on Accuracy

Decide carefully the level of precision you need based on the purpose of your implementation. If you just want to use it to draw a simple sundial, then low-precision calculations are sufficient. But if you want to launch a rocket to the Moon or to visit your ET friends many light years away, then stick to the best algorithms known to human beings.

You may wonder, why don't we just go straight to the 'exact'/'perfect'-precision implementation? With such an implementation, we can then extend it however we like. That's true, but there are some pragmatic issues. First, the state-of-the-art theories generally involve a lot of tedious calculations. E.g., ELP2000-82B is one of the best theories for calculating the position of the Moon. To calculate the position of the Moon at any given instant, you need to perform over 30 thousand trigonometric calculations. So the calculation is slow. Second, storing such information takes space. You may not want to include such data, especially on a portable device. Third, implementing the best solutions involves much more effort in implementation and testing.

We will briefly mention the kinds of applications suited to calculations with different levels of precision. An interesting trade-off is to implement the same calculation at several levels of precision and use only the one each application needs. E.g., to predict a solar eclipse, you can first estimate the date and time at which the eclipse may occur by using low-precision (but faster) calculations. Then, use high-precision (but slower) algorithms to calculate the exact times related to the eclipse.
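
The two-stage idea can be sketched generically. In the Python below, everything is an assumption made for illustration: find_event is a made-up helper, the "event" is simply a zero crossing of some quantity, and the cheap and precise models are toy functions (a truncated series and the exact cosine) standing in for real low- and high-precision astronomical routines.

```python
import math

def find_event(coarse, fine, start, end, step, margin, tol=1e-9):
    """Scan [start, end] with the cheap model, then refine with the precise one.

    Returns the time of the first zero crossing of `fine`, or None.
    `margin` widens the bracket to allow for the cheap model's error.
    """
    t = start
    while t < end:
        if coarse(t) * coarse(t + step) <= 0:        # cheap model changes sign
            lo, hi = t - margin, t + step + margin   # widened bracket
            if fine(lo) * fine(hi) <= 0:             # precise model agrees
                while hi - lo > tol:                 # bisection on the precise model
                    mid = 0.5 * (lo + hi)
                    if fine(lo) * fine(mid) <= 0:
                        hi = mid
                    else:
                        lo = mid
                return 0.5 * (lo + hi)
        t += step
    return None

# Toy stand-ins: a truncated series as the "low-precision" model,
# the exact cosine as the "high-precision" one.
def cheap(t):
    return 1 - t**2 / 2 + t**4 / 24

root = find_event(cheap, math.cos, 0.0, 3.0, 0.5, 0.25)
print(root, math.pi / 2)
```

The precise function is evaluated only inside the small bracket found by the cheap scan, which is where the time saving comes from.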

Last but not least, you need to know the maximum error introduced by each algorithm. With this, you can tell when the algorithm fails. For example, if an algorithm tells you that a new Moon occurs at 11:59 pm tonight, but its precision is low and it may incur up to 10 minutes of error in time, then you can't tell whether the new Moon occurs today or tomorrow. Such a case may not affect you if you want to observe other stars on a new Moon night, but it can be extremely important if you want to calculate an astronomical lunar calendar. To resolve such a case, you need to resort to high-precision calculations.
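
The check described above is simple to automate. The following Python sketch uses made-up names and a made-up timestamp (it is not a real new Moon time): given an estimated instant and the algorithm's maximum error, it reports whether the calendar date of the event is certain.

```python
from datetime import datetime, timedelta

def possible_dates(estimate, max_error):
    """Earliest and latest calendar dates consistent with the error bound."""
    return (estimate - max_error).date(), (estimate + max_error).date()

def date_is_certain(estimate, max_error):
    """True if the whole error interval falls on a single calendar date."""
    earliest, latest = possible_dates(estimate, max_error)
    return earliest == latest

# Hypothetical low-precision result: 11:59 pm with up to 10 minutes of error.
estimate = datetime(2000, 1, 1, 23, 59)
max_error = timedelta(minutes=10)

if date_is_certain(estimate, max_error):
    print("Date is certain:", estimate.date())
else:
    print("Ambiguous between", *possible_dates(estimate, max_error),
          "- fall back to a higher-precision algorithm")
```

A calendar program can run the cheap algorithm for every month and call the expensive one only for the few cases this test flags as ambiguous.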

3.2. Low-Precision Algorithms

To find out the rough locations of the planets at night, to design a normal sundial, to include a cute moon phase indicator in your PDA, to design a compass based on the sun and moon positions, or to know the rough rising and setting time of the sun and the moon, low-precision calculations are sufficient.

There are a few good places to learn such calculations. The book [PDS] covers a lot of useful concepts and low-precision algorithms. The following websites contain good resources too: Approximate astronomical positions, How to compute planetary positions.

With this level of precision, you normally don't need to distinguish between TT (Terrestrial Time) and UTC (Coordinated Universal Time) or among the various reference frames, and you need only simple linear interpolation.
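
Linear interpolation of an angle has one trap: the wrap-around at 360 degrees. The Python sketch below is an illustration with a made-up function name and made-up sample values; it assumes the body moves by less than 180 degrees between the two tabulated instants, so that the shorter arc is the right one.

```python
def lerp_angle(t, t0, a0, t1, a1):
    """Linearly interpolate an angle (degrees) between (t0, a0) and (t1, a1).

    Takes the shorter arc, so 350 -> 10 degrees passes through 0, not 180.
    Assumes the true motion between the samples is less than 180 degrees.
    """
    diff = (a1 - a0 + 180.0) % 360.0 - 180.0   # signed difference in [-180, 180)
    fraction = (t - t0) / (t1 - t0)
    return (a0 + diff * fraction) % 360.0

# Made-up sample values, for illustration only.
print(lerp_angle(0.25, 0.0, 10.0, 1.0, 50.0))    # ordinary case
print(lerp_angle(0.5, 0.0, 350.0, 1.0, 10.0))    # crosses the 0/360 boundary
```

Interpolating the raw numbers 350 and 10 would give 180 degrees for the midpoint, which is on the opposite side of the sky.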

3.3. High-Precision Algorithms

To calculate an accurate birth chart for modern Western astrology or Chinese astrology, to calculate lunar and solar eclipses with satisfactory results, or to calculate a calendar that is based on astronomical events, you may need high-precision algorithms. In particular, knowing the maximum possible error that can occur could be critical for some applications.

A good book that includes such algorithms is [JM]. The book introduces various important concepts. The calculations of the planet positions are based on an abridged version of the VSOP87 theory, which is one of the best planetary theories we have, while the calculation of the Moon's position is based on an abridged version of the famous ELP 2000/82 lunar solution.

You need to distinguish among the time systems here. Don't get confused between TT and UTC. You need to implement the routines to convert between them. A proper interpolation may be necessary to obtain good conversion results. You also need to distinguish among the true, mean, and apparent positions. Be careful to deal with nutation, precession, aberration, parallax, refraction, etc.; and convert among the different reference frames properly.
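
As a rough illustration of the TT/UTC conversion, the Python sketch below relies on two standard relations: TT = TAI + 32.184 s, and TAI = UTC + the accumulated leap seconds. The names are made up, and the leap-second table is deliberately partial (it starts in 2006 and must be extended from IERS announcements); a real implementation needs the full table and proper handling of the leap second itself, which Python's datetime cannot represent.

```python
from datetime import datetime, timedelta

TT_MINUS_TAI = 32.184   # seconds, fixed by definition

# (UTC date from which the value applies, TAI - UTC in seconds).
# Partial table for illustration; extend it from IERS Bulletin C.
LEAP_SECONDS = [
    (datetime(2006, 1, 1), 33),
    (datetime(2009, 1, 1), 34),
    (datetime(2012, 7, 1), 35),
    (datetime(2015, 7, 1), 36),
    (datetime(2017, 1, 1), 37),
]

def tai_minus_utc(utc):
    """Accumulated leap seconds (TAI - UTC) at the given UTC instant."""
    if utc < LEAP_SECONDS[0][0]:
        raise ValueError("date is before the start of this partial table")
    offset = LEAP_SECONDS[0][1]
    for start, value in LEAP_SECONDS:
        if utc >= start:
            offset = value
    return offset

def utc_to_tt(utc):
    """Convert a UTC datetime to Terrestrial Time."""
    return utc + timedelta(seconds=tai_minus_utc(utc) + TT_MINUS_TAI)

utc = datetime(2008, 2, 25, 0, 0, 0)
tt = utc_to_tt(utc)
print(tt, (tt - utc).total_seconds())
```

For dates before 1972, or for work in UT1, the difference has to come from Delta-T tables or fits instead of a leap-second count.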

3.4. "Platinum" Algorithms

If the calculations above don't suit your purpose because you want to design a bullet-proof astrological birth chart, to calculate astronomical calendars, to predict eclipses to an accuracy on the order of milliseconds, or to navigate your spacecraft to another galaxy, then you need to study more and implement very high-precision algorithms.

You can find a lot of great resources from the wonderful Internet. Some useful web sites are listed here.

There are more things you need to take care of. A cubic spline interpolation for Delta-T (= TT - UT1) would be great. You may also want to learn more about astrodynamics and numerical integration.

4. References

[JM] Jean Meeus. Astronomical Algorithms. 2nd Edition. December 1998. ISBN 0-943-39661-1.
[PDS] Peter Duffett-Smith. Practical Astronomy with Your Calculator. Third Edition. March 1989. ISBN 0-521-33699-7.
[WMS] W. M. Smart. Textbook on Spherical Astronomy. 6th Edition (Revised by R. M. Green). 1977. ISBN 0-521-21516-1/0-521-29180-1.
Also take a look at the links I've bookmarked under the Astronomy category on this site.
