Ubuntu 16.04 and sudo Monday, 13 June 2016  
Not sure whether to laugh or cry. Ubuntu 16.04 is now the latest release. I grabbed a server ISO and decided to install - so I could do some dtrace fixes for the Linux 4.4 kernel.

I go through a routine of a base install, get my .profile/.bashrc, bin/ and various other things to make the VM my "home".

First thing is to create my login id as part of the install. Alas, over the years, my UID has been set out of bounds during an install. To make life easier, I ensure I have the same UID across all systems - despite it not really being relevant. But it helps.

First thing: sudo to root, vi /etc/passwd and change my UID to the preferred one. Then "chown -R fox ." on my home dir, logout, and login again. All is ready.

Alas, sudo has a naughty bug in it. Since I have my UID==GID, I also need to edit /etc/group.

Guess what I forgot to do? Yes. Forgot to edit /etc/group.

So now what? Well, "sudo" nicely core dumps on you when your group doesn't match. WTF?! It's 2016!

Oh dear, am locked out of becoming root without a reinstall...

Ok, so let's boot to single user mode, vi /etc/group, and carry on.

Whew! All is resolved.
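
A check along the following lines, run before logging out, would have caught the forgotten /etc/group edit. This is only a sketch in Python: the function name is made up for illustration, and it assumes the standard colon-separated layouts of /etc/passwd (GID in the fourth field) and /etc/group (GID in the third field). It says nothing about why sudo itself crashes.

```python
def users_with_missing_group(passwd_text, group_text):
    """Return (login, gid) pairs whose primary GID has no entry in the group file.

    Hypothetical helper: both arguments are the raw text of the two files.
    """
    known_gids = set()
    for line in group_text.splitlines():
        fields = line.split(":")
        # group format: name:password:GID:member,member,...
        if len(fields) >= 3 and fields[2].isdigit():
            known_gids.add(int(fields[2]))

    missing = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # passwd format: name:password:UID:GID:gecos:home:shell
        if len(fields) >= 4 and fields[3].isdigit():
            gid = int(fields[3])
            if gid not in known_gids:
                missing.append((fields[0], gid))
    return missing

# Usage on a live system (both files are world-readable):
#   with open("/etc/passwd") as p, open("/etc/group") as g:
#       print(users_with_missing_group(p.read(), g.read()))
```

An empty list means every login has a matching group entry, so it should be safe to log out.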


Posted at 22:13:03 by fox | Permalink
  Python & Valgrind & CRiSP Sunday, 11 October 2015  
Been adding a native Python interface to CRiSP, and needed to run valgrind on my code. Alas, Python itself seems to be polluted with many undefined memory read scenarios (this is python2.7). A simple Python script, when run under valgrind, gives rise to errors like:

==17795== Invalid read of size 4
==17795==    at 0x4A30B1: ??? (in /usr/bin/python2.7)
==17795==    by 0x4E05A5: ??? (in /usr/bin/python2.7)
==17795==    by 0x4C5BB1: ??? (in /usr/bin/python2.7)
==17795==    by 0x4B422E: ??? (in /usr/bin/python2.7)
==17795==    by 0x4B3DFA: ??? (in /usr/bin/python2.7)
==17795==    by 0x4B3642: ??? (in /usr/bin/python2.7)
==17795==    by 0x4B6755: ??? (in /usr/bin/python2.7)
==17795==    by 0x4D437A: PyEval_CallObjectWithKeywords (in /usr/bin/python2.7)
==17795==    by 0x4CF3B0: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==17795==    by 0x4CB6B0: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==17795==    by 0x4CAF55: PyEval_EvalCode (in /usr/bin/python2.7)
==17795==    by 0x4C97BB: PyImport_ExecCodeModuleEx (in /usr/bin/python2.7)
==17795==  Address 0x6046020 is 34,480 bytes inside a block of size 36,521 free'd
==17795==    at 0x4C2CE10: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==17795==    by 0x4C6033: PyMarshal_ReadLastObjectFromFile (in /usr/bin/python2.7)
==17795==    by 0x4C5F6D: ??? (in /usr/bin/python2.7)
==17795==    by 0x4C5B4B: ??? (in /usr/bin/python2.7)
==17795==    by 0x4B422E: ??? (in /usr/bin/python2.7)
==17795==    by 0x4B3DFA: ??? (in /usr/bin/python2.7)
==17795==    by 0x4B3642: ??? (in /usr/bin/python2.7)
==17795==    by 0x4B6755: ??? (in /usr/bin/python2.7)
==17795==    by 0x4D437A: PyEval_CallObjectWithKeywords (in /usr/bin/python2.7)
==17795==    by 0x4CF3B0: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==17795==    by 0x4CB6B0: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==17795==    by 0x4CAF55: PyEval_EvalCode (in /usr/bin/python2.7)

I haven't looked at the Python internals to see what this is and how dangerous it is.


Posted at 21:42:43 by fox | Permalink
  Intel NUC...rhymes with ... Thursday, 18 June 2015  
Alas, my Intel NUC decided to die. Our electric meter needed replacing and after a power on - it wouldn't boot. It was suffering amnesia again and refused to find the boot record on the HD. Shame, because I really liked the machine, but I wasn't going to play around to see if the boot record had been corrupted or fiddle around with yet another BIOS update. Shame on Intel for releasing the Celeron NUC and letting it be a poor player.

So, let's look for another machine. Intel-on-a-stick. That's more like it. But the price is excessive, and Intel lost my trust.

Off to a Raspberry Pi2. Having played with the first Raspberry Pi and being let down by its very bad power-supply sensitivity issues, I thought I would try again. I must say, so far, am enjoying it. The NOOBS SD card just works - so much better than any other Linux - it installed flawlessly (despite the confusing initial boot/install screen), and TCP worked on the ethernet.

So, the rss feed (http://www.crispeditor.co.uk:3000) is back up and I just need to copy the crisp website and releases over (as soon as the 128GB USB drive arrives), and should be back in business.

The Pi2 is impressive - 4 core 1GHz ARM processor, but a nuisance as I need to recompile a few binaries over to ARM. At least I can create a new CRiSP release for Pi, and revalidate the ARM dtrace port.

Stay tuned for the crispeditor web site to resurface.

For now the RSS feed supply is a bit dry and empty, but should be a full page of news within 12-24h.


Posted at 23:02:59 by fox | Permalink
  New dtrace release -- mostly cosmetic Wednesday, 06 May 2015  
Pushed out a new dtrace release; this includes domain name changes and one cosmetic compiler fix for some systems. It won't fix/enhance most people's issues, but gives me a decent baseline to compare against.

Posted at 20:33:38 by fox | Permalink
  New website - stop it ! Sunday, 03 May 2015  
The new website www.crispeditor.co.uk is live, and it's interesting to watch who is accessing it. Right now, there are two main attempts at cracking it - one is a WordPress vulnerability and the other is a PHP one. I guess the bots are hungry for food.

Also, it is interesting how quickly both Google and Bing start their bot machines. That is encouraging. Very soon, I will be the most popular website in the solar system - just keep clicking and we might make it into the Interweb Hall of Fame.


Posted at 20:28:01 by fox | Permalink
  http://www.crispeditor.co.uk - now live Saturday, 02 May 2015  
The domain transfer and contents transfer are complete - the links on the website can now be used to download copies of CRiSP, DTrace and other tools. I will need to tidy up some of the very old contents.

Please mail me at crispeditor at gmail.com if you see any issues.


Posted at 09:19:31 by fox | Permalink
  http://www.crispeditor.co.uk Thursday, 30 April 2015  
The new replacement and home for the very old Demon website, where you can download copies of CRiSP, is nearly ready. You can visit the website at

http://www.crispeditor.co.uk

and you will find content and download links, but the links will be a bit stale as I copy things around. Hope to have this at parity in the next day or so.

You may not be able to actually download anything, but that's high on the list to fix.


Posted at 23:18:36 by fox | Permalink
  Goodbye Demon - shame the service is so bad Sunday, 26 April 2015  
I joined Demon Internet back in 1992 - the first service provider in the UK, and home for CRiSP (www.crisp.demon.co.uk) for the last 23 years.

Despite various ownership changes of the company, it seems that the FTP service is no longer functioning and I will finally retire my Demon account this year.

For those of you looking to find CRiSP, you will find that eventually that website will vanish, and I will replace it with another site - once I have decided on the best options.

My gmail accounts will continue to work, but my Demon email address will stop working at some point.

This is just a heads up.

I hope I am wrong and the FTP service recovers, but it seems like Demon are transferring the hosting services to another provider, and nobody at Demon understands the product they own, so it appears.


Posted at 21:32:27 by fox | Permalink
  another dtrace delay Sunday, 01 March 2015  
Everything was looking promising to release a new dtrace sometime last week. It was working on the 3.16 kernel, 3.8, 3.4 and then onto RH5.6 (2.6.18). I ran into a lot of issues on 2.6.18 - not surprising, given the code mutations. Much of the last 2 weeks was spent on the execve() system call. It would panic the kernel. Despite a lot of experiments and reading of the assembler and kernel code, I kept doing silly things. It really doesn't help that the 2.6.18 kernel will hard panic on a stray GPF - this made it very difficult to figure out what was going on.

Eventually I got every line of assembler working and resolved the issues with registers in the C code.

Along the way I had an issue with the "old_rsp" symbol. This is not exposed in /proc/kallsyms, and not even in the /boot/System.map file. I had to write a tool to extract this from inside the kernel. But this ran into complications because /proc/kcore is broken on the RH/Centos kernels. I had to create a new device driver, which has to be loaded into the kernel prior to the build of dtrace ("/proc/dtrace_kmem"). It's a very simple driver only designed to handle the scenario of building dtrace.

Having got this to work, the next roadblock was the rt_sigreturn() syscall, which panicked the kernel. Careful investigation showed a missing line of assembler (for the 2.6.18 kernel). Now that works.

Now everything is looking good on RH5/Centos5 but before going on the trawl of later kernels and proving I didn't break anything, I have an issue with x_call.c. Either I use the native smp_call_function() interface - which works great, until we panic the kernel - or I use my implementation, which doesn't seem to be broadcasting to the cpus - this means certain probes get "lost".

So, hopefully this week or next weekend - depending on the xcall issues.


Posted at 21:06:30 by fox | Permalink
  dtrace update ... Monday, 23 February 2015  
Still delaying the dtrace release. Having gotten 3.16 kernels to work, I started working backwards on random 3.x kernels, to validate it still worked there. I fixed a number of issues there, and then headed into RedHat 5.6 / Centos 5.6 land (2.6.18+ kernel).

I spent some time trying to get execve() syscall tracing to work - and am still working on that.

Along my journey, I noticed a few things. Firstly, dtrace4linux is too complicated - trying to support 32+64b kernels, along the entire path back to 2.6.18 or earlier, is painful. I cannot easily automate regression testing (not without a lot more hard-disk space, and not worthwhile whilst I am aware of obvious bugs to fix). I could simplify testing by picking any release, and just rebooting with different kernels - rather than full ISO images of RedHat/Centos/Ubuntu/Arch and so on.

I also noticed that the mechanism dtrace4linux uses to find addresses in the kernel is slightly overkill. It hooks into the kernel to find symbols which cannot be resolved at link time. The mechanism I have is pretty interesting - relying on a Perl script to locate the things it needs. I found a case where one of the items I need is not visible at all in user space - it's solely in the kernel - part of the syscall interrupt code (the per-cpu area). Despite what the latest kernels do, some older kernels *don't*. And catering for them is important. In one case I have had to go searching the interrupt code to find this value. I ended up writing a C program to run in user space, prior to the build, and really, it would have been better to generalise this so that everything we need is simply defined in a table compiled into the code, rather than the /dev/fbt code to read from the input stream. This would ensure that a build compiles and works. Today, sometimes I debug issues with old kernels because a required symbol is missing and we end up dereferencing a null pointer (not a nice thing to do in the kernel).
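
To make the "table compiled into the code" idea concrete, here is a minimal sketch in Python. It is not how dtrace4linux does it: the function names and the C struct layout are invented for illustration, and it only covers symbols that appear in /proc/kallsyms-style text ("address type name"), not the ones that have to be dug out of the kernel by other means. The point is that a missing symbol fails the build step, rather than turning into a null pointer at run time.

```python
def resolve_symbols(kallsyms_text, required):
    """Map each required symbol name to its address, or raise if any is missing.

    kallsyms_text is text in the /proc/kallsyms layout: "address type name [module]".
    """
    wanted = set(required)
    found = {}
    for line in kallsyms_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        addr, _kind, name = fields[:3]
        if name in wanted and name not in found:
            found[name] = int(addr, 16)
    missing = sorted(wanted - set(found))
    if missing:
        # Fail at build time instead of dereferencing NULL in the kernel.
        raise LookupError("unresolved kernel symbols: " + ", ".join(missing))
    return found

def emit_c_table(symbols):
    """Render the resolved symbols as a C table (hypothetical layout)."""
    lines = ["static const struct { const char *name; unsigned long addr; } sym_table[] = {"]
    for name in sorted(symbols):
        lines.append('    {"%s", 0x%xUL},' % (name, symbols[name]))
    lines.append("};")
    return "\n".join(lines)
```

Note that on many systems /proc/kallsyms shows zeroed addresses to non-root users, so a real build step would need to run with enough privilege or read System.map instead.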

One problem I had with the above, was that gdb on the older distro releases cannot be used to read kernel memory due to a bug in the kernel precluding reading from /proc/kcore. Fortunately, I include a script in the release which emits a vmlinux.o, complete with symbol table, from the distribution vmlinuz file.

I haven't reverified the ARM port of dtrace, but that's something for a different rainy or snowy day.


Posted at 21:48:32 by fox | Permalink
  new dtrace .. small update Friday, 20 February 2015  
The next release of dtrace is hopefully this weekend. Having resolved the issues I had previously, have been doing more testing - so far only really on the 3.16 kernel, and found that some of the syscalls were behaving badly due to reimplementation in the kernel. Hopefully when I have fixed the last two or three, then I can finish my merges and push out the latest release. I will do a cursory check on some of the older kernels - it is likely I have made a mistake somewhere and broken older kernels, but will be easier to fix having made some internal changes.

Note that no new functionality is in here - the issues with libdwarf remain - I may try again to solve that issue, and "dtrace -p" is still a long way off from being functional.

Given that 3.20 is now the current kernel, I may need to see if that works and pray that 3.17-3.20 didn't affect how dtrace works, or, if it does, the work to make it compile should be much less than the issues that 3.16 raised.


Posted at 18:07:51 by fox | Permalink
  Why is gcc/gdb so bad? Thursday, 19 February 2015  
When gcc 0.x came out - it was so refreshing. A free C compiler. GCC evolved over the years, got slower and used more memory. I used to use gcc on a 4MB RAM system (no typo), and wished I had 5MB RAM. Today, memory is cheap, and a few GB to compile code is acceptable. (The worst I have seen is 30+GB to compile a C++ piece of code - not mine!)

One of the powerful features of gcc was that "gcc -g" and "gcc -O" were not exclusive. And gdb came about as a free debugger, complementing gcc.

Over recent years, gdb has become closer to useless. It is a powerful and complex and featureful debugger. But I am fed up single stepping my code, and watching the line of execution bounce back and forth because the compiler emits strange debug info where we move back and forth over lines of code and declarations.

Today, in debugging fcterm, my attempt to place a breakpoint on a line of code puts the breakpoint *miles* away from the place I am trying to intercept. This renders "gcc -g" close to useless, unless I turn off all optimisations, and pray the compiler isn't inlining code.

Shame on gcc. Maybe I should switch to clang/llvm.


Posted at 23:05:06 by fox | Permalink
  address: 0000f00000000000 Saturday, 14 February 2015  

Strange. Still trying to find out why dtrace is not passing my tests. I have narrowed it down to a strange exception. If the user script accesses an invalid address, we either get a page fault or a GPF. DTrace handles this and stubs out the offending memory access. Here's a script:

build/dtrace -n '
        BEGIN {
               cnt = 0;
               tstart = timestamp;
        }
        syscall::: {
               this->pid = pid;
               this->ppid = ppid;
               this->execname = execname;
               this->arg0 = stringof(arg0);
               this->arg1 = stringof(arg1);
               this->arg2 = stringof(arg2);
               cnt++;
        }
        tick-1s { printf("count so far: %d", cnt); }
        tick-500s { exit(0); }
'

This script will examine all syscalls and try to access the string for arg0/1/2 - and for most syscalls, there isn't one. So we end up dereferencing a bad pointer. But only some pointers cause me pain. Most are handled properly. The address in the title is one such address. I *think* what we have is the difference between a page fault and a GPF. Despite a lot of hacking to the code - I cannot easily debug, since once this exception happens the kernel doesn't recover. I have modified the script above to only do syscall::chdir: which means I can manually test via a shell, doing a "cd" command. On my 3-cpu VM, I lose one of the CPUs and the machine behaves erratically. Now I need to figure out if we are getting a GPF or some other exception.

I tried memory addresses: 0x00..00f, 0x00..0f0, 0x00..f00, ... in order to find this. I suspect there is no page table mapping here or it's special in some other way. May need to dig into the kernel GDT or page table to see what is causing this.
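
One thing that may be relevant: on x86-64 with 48-bit virtual addresses (an assumption about this VM, i.e. 4-level paging), an address is only "canonical" if bits 47 to 63 are all the same. Touching a non-canonical address raises a GPF rather than a page fault. The sketch below, in Python with made-up function names, walks the same nibble pattern and reports the first address that fails that rule.

```python
VA_BITS = 48  # assumption: 4-level paging, 48-bit virtual addresses

def is_canonical(addr, va_bits=VA_BITS):
    """True if bits (va_bits-1)..63 of a 64-bit address are all 0 or all 1."""
    upper = addr >> (va_bits - 1)              # bit 47 and everything above it
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 one-bits when va_bits is 48
    return upper == 0 or upper == all_ones

def walking_nibble_addresses():
    """0x00..00f, 0x00..0f0, 0x00..f00, ... up to 0xf0..00."""
    return [0xF << shift for shift in range(0, 64, 4)]

def first_non_canonical():
    """First address in the walking sequence that is non-canonical, or None."""
    for addr in walking_nibble_addresses():
        if not is_canonical(addr):
            return addr
    return None

if __name__ == "__main__":
    for addr in walking_nibble_addresses():
        kind = "canonical" if is_canonical(addr) else "NON-canonical"
        print("%016x %s" % (addr, kind))
```

Under that assumption the first non-canonical address in the sequence is 0000f00000000000, the one in the title, which would fit the page fault versus GPF theory.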

UPDATE: 20150215

After a bunch of digging I found that the GPF interrupt handler had been commented out. There was a bit more to this than that, because even when I re-enabled it, I was getting some other spurious issues. All in all, various bits of hack code and debugging had got in the way of a clear message.

I have been updating the sources to merge back in the fixes for the 3.16 kernel, but have a regression on syscall tracing which can cause spurious panics. I need to fix that before I do a next release.


Posted at 10:28:01 by fox | Permalink
  no dtrace updates Monday, 09 February 2015  
People have been questioning why there are no dtrace updates. I hope to be in a position to properly respond shortly. Just before Christmas, I started work on Debian Jessie (3.16 kernel) and hit a number of issues. Although I made good progress fixing issues on x32 syscalls on a x64 system, and systematically fixing other issues, I had to hack the driver tremendously. These hacks are experiments to figure out why I could so easily crash the kernel. The usual means of panicking the kernel did not hold - normally a stray issue causes a kernel message and I can debug around the issue to isolate the cause.

The issues I hit were all very low level - the cross-cpu calls, the worker interrupt thread, and the current issue - relating to invalid pointers when accessed via a D script. I have a "hard" test which won't pass without crashing the kernel - crashing the kernel really hard, requiring a VM reboot. This is nearly impossible to debug. The first thing I had to do was increase the console mode terminal size - when the panic occurs, the system is totally unresponsive and all I have is the console output to look at, with no scrolling ability. Having a bigger console helps - but it seems like the GPF or PageFault interrupt, when occurring inside the kernel, does not work the same way as it has on all prior Linux kernels. Looking closely at the interrupt routines shows some changes in the way this works - enough to potentially cause a panicking interrupt to take out the whole kernel; this makes life tough to debug.

If I am lucky, the area of concern is related to the interrupt from kernel space. If I am unlucky, it is not this, but something else. (Am hypothesising that the kernel stacks may be too small.)

I have been saving up putting out any updates, despite some pull requests from people, because I am not happy the driver is in a consistent state to release. When I have finished this area of debugging, I can cross-check the other/older kernels, and see if I have broken anything.

It is very painful dealing with hard-crashing kernels - almost nothing helps in terms of debugging, so am having to try various tricks to isolate the instability. These instabilities, in theory, exist on other Linux releases - but I will only know when I have gotten to the bottom of the issue.


Posted at 23:02:06 by fox | Permalink
  DTrace & Debian/Jessie Monday, 01 December 2014  
Someone reported a bug in dtrace whereby execve() wasn't tracing. I created a VM and started testing, and can confirm this. Looks like dtrace is getting confused by the newly rewritten syscall assembler. I have a working version for this in my testbed, but I found that changes to the IPI code in the kernel are making any dtrace probes extremely unreliable - looks like a 1:N chance of seeing output (where N is the number of cpus you have).

I have some similar issues in Ubuntu 14.04 - hopefully similar issues.

Hope to have a new release in a few days.


Posted at 22:54:21 by fox | Permalink
  CRiSP/crtags Optimisation Saturday, 08 November 2014  
The tagging facility in CRiSP has always worked reasonably well - it is based on a series of custom parsers for each programming language. Over the years, the list of languages supported has grown and is now a huge repository of nearly every common language and file format out there. (See: "crtags -help" for details).

The initial implementation is getting on for nearly 20 years old. The goal originally was to provide the "ctags" facility of vi, but better. Machines of the day had 4-16MB of RAM rather than the 4-16GB which is common today, so effort was made to optimise space usage. The crtags file format is a series of sections of files and items which are tagged. It has been optimised, right from the beginning, to minimise space usage and avoid bad paging behavior. (Now a distant artifact! Systems rarely page or are memory constrained.) This attempt to optimise memory goes back to the 1MB machines that CRiSP was originally built on. These optimisations are no longer necessary - but removing them would only offer a small change in performance.

crtags is designed to work with reasonably sized projects and directories. It takes a few seconds to scan the nearly 8000 files in the CRiSP source tree, and I regularly use it on the Linux kernel. The 3.16.1 kernel has 47426 files in it. Scanning that takes a little while. I have benchmarked this over the years.

Recently I did some more work to look at the performance and optimisation facilities in crtags. I collected a series of Linux kernels - so I would have a good/large test case - about 500MB of source files, 67356 files in all. On my i7 laptop (crtags is single threaded), it was taking about 2m20s to scan the files. On investigating the performance, I could see that we had an O(n^2) algorithm on filename matching. This is silly: given the complexity of the language parsers, the mere filenames should not be using a lot of the CPU.

I modified the code to put in a hash table for filename matching, and this gave a huge win - down to about 23s for scanning the same files - about a 6x improvement in speed.
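
The difference between the two approaches can be sketched as follows. This is illustrative Python, not the crtags code (which is C); the function names are invented, and a Python dict stands in for the hash table.

```python
def find_linear(filenames, name):
    """Scan the list for every lookup: O(n) per call, so O(n^2) over n lookups."""
    for pos, candidate in enumerate(filenames):
        if candidate == name:
            return pos
    return -1

def build_index(filenames):
    """Build the hash table once: O(n) in total."""
    index = {}
    for pos, name in enumerate(filenames):
        # Keep the first occurrence, so results match the linear scan.
        index.setdefault(name, pos)
    return index

def find_hashed(index, name):
    """Look up one filename in (amortised) constant time."""
    return index.get(name, -1)
```

With 67356 files, the linear version does on the order of 67356 * 67356 / 2 string comparisons in the worst case, while the hashed version does one table build and one lookup per file.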

In looking at crtags, most of the processing is a constant per file - and each file is handled, one after the other. This opens up a huge win by multithreading the code. Potentially an Nx speedup, on an N cpu system. The code is difficult to convert to multithreading - it would require a lot of edits and refactoring, to ensure each thread is avoiding global state - a common reason why converting a non-threaded application to multithreaded is so difficult.

It's depressing how little attention the C standards bodies and compiler writers pay to converting non-threaded code to threaded. Really, there are a set of transformations (refactoring) and one would think that tools could help identify the major areas at issue (use of global variables, and use of "static" variables). I may create a refactoring macro in CRiSP to handle this.

On Unix, using a fork/join model of operation, one can create the equivalent of multithreaded code, by use of the fork() and wait() system calls. For example, divide all the files to be processed up into separate groups, processed by individual CPUs. Then the issue of global state and locking disappears - at the expense of more work on the "join" or merge at the end of processing.
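
A minimal sketch of that fork/join shape, in Python for brevity (crtags itself is C, and this is not its code). The function name, the use of a pipe per child and the pickle encoding are all assumptions made for the example; the "merge" here is a plain concatenation in the parent.

```python
import os
import pickle

def fork_join(items, worker, jobs):
    """Process items in `jobs` forked children and merge the results in the parent."""
    groups = [items[i::jobs] for i in range(jobs)]   # divide the work up front
    children = []
    for group in groups:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: has its own copy of all global state, so no locking is needed.
            os.close(read_fd)
            status = 1
            try:
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump([worker(item) for item in group], out)
                status = 0
            finally:
                os._exit(status)   # never fall back into the parent's code path
        os.close(write_fd)
        children.append((pid, read_fd))

    merged = []
    for pid, read_fd in children:
        # The "join": collect each child's results, then reap the child.
        with os.fdopen(read_fd, "rb") as src:
            merged.extend(pickle.load(src))
        os.waitpid(pid, 0)
    return merged
```

For example, fork_join(list_of_files, scan_one_file, 8) would correspond to something like "-j 8", with scan_one_file being whatever per-file function is needed. Results come back grouped by child, not in the original order, which is part of the extra work the merge has to deal with.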

I have modified crtags to use this fork/join model (it is a command line option, and not enabled by default), and reran my test. The above test went from 23s to 4s by using "crtags -j 8" - using the 4 real and 4 hyperthread CPUs on my i7. About a 6x performance increase. (The final code will be slower, since this test does not yet include the merge.)

So we went from 2m20s to 4s - a 35x speed up, with just a handful of lines of code.

The depressing thing about Windows (and this is late 2014) is that it still does not support fork(). It does support threads. So the above code will have no effect on Windows, and real threading will have to be implemented or an alternate way of achieving the same result.

I do fail to understand why Windows can't implement fork() - from the user space point of view, there is almost nothing to implement. From the kernel point of view, it's not a huge amount of code. Granted, Windows processes may carry more state and forking may be more expensive if Windows could do it, but that would be such a big benefit when writing portable code or porting code to Windows. Oh well. (Cygwin supports fork(), and it is hugely expensive in operation, since it relies on software to copy huge blocks of memory around, rather than relying on the CPU's MMU to do copy-on-write (COW) operation - the key to why fork() is so efficient on Unixes).

Having said that, fork() on Linux is not brilliantly fast. Despite processors being so much faster than years of old - fork() seems to be getting slower, possibly related to the need for all CPUs to synchronise MMU and other state, rather than fork() itself getting more complicated.

To end users of CRiSP, they may see the initial performance optimisation, but unless they are working on extremely huge projects, they may hardly ever notice the change put in place. Also note, that in CRiSP, one rarely tags an entire project - CRiSP does incremental updates to the tag database, as you are editing/saving files.


Posted at 23:38:10 by fox | Permalink