Month: February 2016

MAME is more than an emulator
Like many people out there, I discovered MAME, the Multiple Arcade Machine Emulator, while looking for an emulator for old arcade games like Pac-Man, Galaxian and Out Run, and playing these games again was a refreshing experience… A couple of years ago, I discovered a twin program, MESS, which shared most of the MAME source code but evolved in another direction, until the two code bases were recently merged. MESS differs from MAME in that it primarily targets the old computers of the seventies and eighties: Commodore, Apple, Oric, ZX81 and so on. MESS emulates the dedicated peripheral devices that were specific to these machines, such as the magnetic audio tape devices used as mass storage units, floppy drives, keyboards, light pens…

The principle is the same as with MAME: you obtain the ROM files containing the firmware of these old computers. Typically a BASIC interpreter is hard-coded in the ROMs, along with some code written in assembly language that interacts directly with the hardware (drawing on the screen, scanning the keyboard matrix, reading from or writing to the tape device or a floppy drive, playing a sound).

MAME has a particularity that makes it very powerful: it emulates each hardware chip that composes a machine, and builds an emulated machine simply by describing the way the chips interact. This has many advantages:
- it is a natural and efficient way to factor the source code, because the behavior of a chip only needs to be written once. It can then be debugged by more people, including everyone interested in any machine containing that chip;
- the chip must be emulated in a fine-grained way, respecting timings and latencies and covering all the features described in its datasheet, because not every machine uses all the features of a chip, and some use them in a specific way. The emulation of the chip has to cover all the use cases;
- the datasheets of these chips are usually available, and they are very precise;
- the description of a machine is simpler, because it consists of writing down the relations between the chips, in terms of inputs and outputs, much like the chips were wired together on the PCB;
- many machines can be emulated by this method, without much effort, even if they differ only by a few details;
- the hardware bugs that existed in the original hardware are emulated in MAME as well, and this is a feature, not a bug.
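To make the idea of "describing the way the chips interact" more concrete, here is a minimal Python sketch. It is not MAME code (MAME is written in C++ and its real machine descriptions look different); the class names, the `build_machine` function and the address ranges are all hypothetical, chosen only to show that each device is written once and a machine is just a wiring description.

```python
class Ram:
    """A read/write memory chip, written once and reused by any machine."""

    def __init__(self, size):
        self.data = bytearray(size)

    def read(self, offset):
        return self.data[offset]

    def write(self, offset, value):
        self.data[offset] = value & 0xFF


class Rom:
    """A read-only memory chip: writes are silently ignored."""

    def __init__(self, image):
        self.data = bytes(image)

    def read(self, offset):
        return self.data[offset]

    def write(self, offset, value):
        pass


class Bus:
    """Routes each CPU address to the device wired to that range."""

    def __init__(self):
        self.regions = []

    def map(self, start, end, device):
        self.regions.append((start, end, device))

    def _find(self, address):
        for start, end, device in self.regions:
            if start <= address <= end:
                return device, address - start
        raise ValueError(f"nothing mapped at {address:#06x}")

    def read(self, address):
        device, offset = self._find(address)
        return device.read(offset)

    def write(self, address, value):
        device, offset = self._find(address)
        device.write(offset, value)


def build_machine(rom_image):
    """The 'machine description': only the wiring, no chip logic.

    The address ranges below are made up for illustration.
    """
    bus = Bus()
    bus.map(0x0000, 0x9FFF, Ram(0xA000))
    bus.map(0xC000, 0xFFFF, Rom(rom_image))
    return bus
```

A second machine that differs only in its memory layout would reuse `Ram`, `Rom` and `Bus` unchanged and provide a different `build_machine`.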
I was in school when these computers appeared on the market, and the school bought one in 1985. It was my first contact with a real computer: a box engineered by a French company, Thomson, built around a Motorola MC6809 8-bit processor.

Someone wrote a machine description in MAME, and the fun begins:

MAME debug window screenshot
- MAME includes a graphical debugger that operates at the level of the emulated machine, with all the features of a modern debugger. It lets you inspect every aspect of the emulated hardware: the registers of the emulated CPU, the RAM and ROM banks and their layout, and the video memory. You can annotate the disassembly of the ROMs, set breakpoints, conditional breakpoints and watchpoints, and get live memory coverage of the code executed by the CPU. MAME is helped by the limited resources at play in these machines: the CPU is clocked at 1 MHz and there are 64 KB of addressable memory, meaning a 16-bit address space;
- Let’s consider another example: these machines generally use a magnetic audio tape as the sole way to permanently store and retrieve data. The data is handled in analog form, with a very simple encoding, for example a frequency modulation scheme such as MFM to encode a zero or a one. Where a classic emulator would put a hook at the entry point of the monitor routine in charge of reading from the cassette device, and replace the original code with a function reading directly from a file on the host machine, MAME prefers to simulate a cassette device with its underlying tape media: the cassette device can play, pause, record, rewind and eject, and it is associated with a tape media, typically an audio file in WAVE format on the host machine. The bonus of this low-level emulation is that it makes it possible to work with original tapes, after converting them to an audio file with some hi-fi equipment. People who used this technology will remember that reading a program stored on such a tape was neither fast nor reliable: something like 900 baud. Several minutes were needed to load a big program filling all the available memory. The emulation in MAME reproduces this precise timing, so reading from the emulated cassette device is no faster today. The only difference is that MAME can run the whole emulation faster.
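As a rough check of the "several minutes" figure above, the following Python sketch estimates the loading time. It assumes that one baud carries one bit and ignores start/stop bits, leaders and gaps between blocks, so a real tape would be somewhat slower; the function name and the speed-up factor are illustrative only.

```python
def tape_load_seconds(size_bytes, baud=900, speedup=1.0):
    """Estimate the time needed to read size_bytes from tape.

    Assumes 1 baud == 1 bit per second and no framing overhead
    (a simplification). speedup models running the whole emulation
    faster than real time.
    """
    bits = size_bytes * 8
    return bits / baud / speedup


if __name__ == "__main__":
    full_memory = 64 * 1024  # the whole 16-bit address space
    seconds = tape_load_seconds(full_memory)
    print(f"64 KB at 900 baud: about {seconds / 60:.1f} minutes")
    fast = tape_load_seconds(full_memory, speedup=10)
    print(f"same load with the emulation running 10x faster: about {fast:.0f} s")
```

Under these assumptions, filling the whole 64 KB address space takes a little under ten minutes of tape time, which is consistent with the order of magnitude remembered above.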
I would have dreamed of all these possibilities back in those days, when documentation was sparse and experiments were slow, repetitive and error-prone.
February 26, 2016
Self hosting
What can be hosted locally on a Linux server that would otherwise be delegated to the silos of big corporations? What is worth the effort? In all these cases, the underlying server has to run 24/7, but it does not need to be a powerful box.
- mail server: yes, of course. This is the first and most straightforward service that should stay under the control of the user, because of the sensitivity of the information it handles. Setting up a mail server is not just a matter of configuring postfix or sendmail: it also means an IMAP server, a server-side anti-spam filter, a greylisting server and a webmail;
- dns server: control of the DNS for the zone hosting your services is required, because the DNS tells the Internet the address of the box running your mail server, of the box running your web server, and of all the other services you may host yourself;
- web server: setting up an apache web server on your box is the entry point for hosting all other kinds of web services, so even if you don’t plan to write HTML pages yourself, you’ll have to run a web server. What you can host is limited by the upload bandwidth of your ISP: even modest audio files may slow down your network connection considerably if they are downloaded by several clients simultaneously. Apache is not well suited to traffic shaping, and doing it at the Linux level is difficult to tune correctly. Upload capacity is often limited in favor of download speed; this is the meaning of the A, for Asymmetric, in ADSL, and it is usually not well advertised by ISPs;
- XMPP/jabber server: hosting a jabber server for one’s own use is interesting, because you keep your list of contacts under your control. It’s a low-bandwidth service, because most of the traffic consists of signaling messages. If you want video and audio chat capabilities, you may want to install a STUN or a TURN server;
- identity provider: running an OpenID server is relatively simple on top of apache, and it allows you to reclaim your identity with web services that allow this kind of delegation, as OpenStreetMap does for example;
- firefox sync: it used to be very simple to host a sync server before the introduction of Firefox Sync 1.5. It now requires more components running server-side, so it is clearly more difficult, even though not all components have to run locally: it is possible to set up a hybrid configuration, where only the sync service runs locally and still relies on the remote Firefox Accounts service hosted by Mozilla. However, keeping sensitive information like browsing history locally is certainly worth the effort, even though the raw data is never accessible in the clear, even when stored on Mozilla servers;
- blogging platform: setting up a blogging platform like this WordPress instance is really simple, and doesn’t require much effort (compared to the firefox sync setup). It is a very classical LAMP application, requiring a database backend, a scripting language and a web server;
- file sharing, dropbox replacement: setting up OwnCloud to share files is very simple too. You can use it to centralize your calendars and your contact lists, in addition to sharing your files. For people using evolution, it may be interesting to set up a syncevolution server too.
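The upload-bandwidth limitation mentioned for the web server can be estimated with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the upload link is shared evenly between clients and ignores protocol overhead, and the figures used in the example (a 5 MB audio file, a 1 Mbit/s upload link, 4 clients) are hypothetical, not measurements.

```python
def download_time_seconds(file_size_mb, upload_mbit_s, clients=1):
    """Time for each client to fetch a file from a home server.

    Assumes the upload link is split evenly between simultaneous
    clients and ignores protocol overhead (a simplification).
    """
    per_client_mbit_s = upload_mbit_s / clients
    return file_size_mb * 8 / per_client_mbit_s


if __name__ == "__main__":
    # Hypothetical figures: 5 MB file, 1 Mbit/s upload, 1 and 4 clients.
    for clients in (1, 4):
        seconds = download_time_seconds(5, 1, clients)
        print(f"{clients} client(s): {seconds:.0f} s per download")
```

During such a transfer the upload link is saturated, which is what slows down every other outgoing connection from the same box.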
Having a local server under the control of the user, running all this software, may be challenging because of the initial configuration steps required to get a working setup. I think it is worth the effort. Hosting all these services locally doesn’t restrict the way you communicate with other people, because most of these services rely on a decentralized architecture, or at least can work this way. The last point to tackle is keeping all the installed software up to date, and installing security updates as soon as they are available for your Linux distribution, because each service that runs locally is a potential attack target that can compromise your server. Keep your running services up to date and you should be safe.
February 26, 2016
Video chat using standard protocols
Until recently, I had always been disappointed by the less-than-optimal possibilities for audio and video chat between two Linux users, compared to the ease of use of proprietary solutions. Each component is available, but the glue required to make them work together was missing, or not properly configured at best. The number of components involved on a typical Linux desktop may be impressive:
- A working gstreamer environment, able to capture audio and video from a local source, and capable of negotiating and building a pipeline with a set of video and audio codecs compatible with the codecs of the peer. Remember that not all distributions provide the same set of codecs. Some will handle H264 or MPEG2 in hardware with the assistance of the graphics card, some will handle them in software with ffmpeg. The gstreamer pipeline must also provide the element that will establish and maintain the conference session;
- A way to exchange out-of-band information about network topology, for example an account on an XMPP (jabber) server with the appropriate extensions. This information is required to establish the network connection for the data streams at the lower level;
- A high-level graphical application that hides most of this stuff from the end user, pidgin or empathy for example;
- A library that implements the low-level protocol used to make a stream of data flow in both directions between the peers. This is another tricky part, because of the wide range of network topologies and their particularities. The NAT used in most home routers or broadband networks typically hides private, non-routable IPv4 addresses behind a single public IPv4 address, which brings challenges inherent to this technology: normally, new connections can only be established from the inside to the outside of the NAT… The same restriction applies to a firewall, which generally trusts the internal network and allows connections to be initiated only from the inside to the outside.
Some RFCs describe interesting ways to overcome these latter limitations. RFC 5245 (ICE) is one of them, built on top of other RFCs (STUN, TURN). The proposed ideas are simple, but require some assistance and synchronization:
- A client wanting to establish a network connection with a peer must have a way to discover its public IP address, if it is located behind a NAT. This is achieved with the help of a STUN server, whose only job is to reply to the client’s requests, reporting back the IP address the request came from.
- A trick used with certain classes of NAT to reach a box inside a private network is the UDP hole punching method: the client inside the NAT emits a UDP datagram first, just to create the association in the NAT, so that the real initial datagram from the outside looks like a reply to the previous outgoing datagram. This is not a magic bullet, and it may fail when the two peers are not properly synchronized in the way they send their initial UDP datagrams. The out-of-band connection to a third-party XMPP server helps to synchronize them. It may also fail if the NAT does not preserve the IP:port association between consecutive outgoing connections (symmetric NAT), because in this case the client inside the NAT has no way to provide this association to its remote peer. Linux NAT with iptables, for example, does its best to preserve the association by default; see the --random option of the SNAT rule for details.
- When a direct connection cannot be established with the methods described previously, a fallback alternative that is expected to work in most cases, at the cost of more network latency, is to use an external relay server (a TURN server), which both peers access with an outgoing connection, so without much risk of being blocked.
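To illustrate how small the STUN exchange described in the first item is, here is a Python sketch of the message format defined in RFC 5389: it builds a Binding Request and decodes the XOR-MAPPED-ADDRESS attribute of the answer. The function names are my own, only IPv4 is handled, and `fake_binding_response` merely imitates a server so that the sketch can be tried without any network access.

```python
import ipaddress
import os
import struct

MAGIC_COOKIE = 0x2112A442       # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020
FAMILY_IPV4 = 0x01


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a request without attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id, transaction_id


def parse_xor_mapped_address(packet, transaction_id):
    """Return the (ip, port) pair that the server saw the request come from."""
    msg_type, length, cookie = struct.unpack("!HHI", packet[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding success response")
    if packet[8:20] != transaction_id:
        raise ValueError("transaction ID mismatch")
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[offset:offset + 4])
        value = packet[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and value[1] == FAMILY_IPV4:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            address = ipaddress.IPv4Address(xaddr ^ MAGIC_COOKIE)
            return str(address), port
        # Attributes are padded to a multiple of 4 bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    raise ValueError("no IPv4 XOR-MAPPED-ADDRESS attribute found")


def fake_binding_response(transaction_id, ip, port):
    """Imitate a server reply (for offline experiments only)."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = int(ipaddress.IPv4Address(ip)) ^ MAGIC_COOKIE
    attr = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, FAMILY_IPV4,
                       xport, xaddr)
    header = struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE)
    return header + transaction_id + attr


if __name__ == "__main__":
    request, tid = build_binding_request()
    # 203.0.113.7 is a documentation address, used here as a made-up example.
    response = fake_binding_response(tid, "203.0.113.7", 54321)
    print(parse_xor_mapped_address(response, tid))
```

In a real client the request would be sent over UDP to a STUN server, and the address found in the reply is what gets advertised to the peer over the out-of-band XMPP channel.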
The role of RFC 5245 is to describe how each client wanting to establish a connection will try each of these methods in turn, in a synchronized and prioritized way, testing the easiest direct connection first and falling back to the expensive relay connection last. With all this infrastructure in place, one can finally provide the user with an efficient and reliable way to establish a video chat with a peer, even in a hostile network environment.
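The prioritization can be sketched in Python with the candidate priority formula given in RFC 5245, using the type preferences the RFC recommends (126 for host, 110 for peer reflexive, 100 for server reflexive, 0 for relayed candidates). The function name and the candidate list are illustrative; a real ICE agent also pairs local and remote candidates and runs connectivity checks, which this sketch leaves out.

```python
# Type preferences recommended by RFC 5245: direct candidates first,
# relayed (TURN) candidates last.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_preference=65535, component_id=1):
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)"""
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_preference << 8)
            + (256 - component_id))


if __name__ == "__main__":
    # Hypothetical candidates gathered by one client, in arbitrary order.
    candidates = ["relay", "host", "srflx"]
    for cand_type in sorted(candidates, key=candidate_priority, reverse=True):
        print(cand_type, candidate_priority(cand_type))
```

Sorting by decreasing priority gives the order described above: the local (host) address first, then the address learned through STUN (server reflexive), and the TURN relay last.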
February 26, 2016