<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0">
<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Sherlock and Symtab Library</title>
</head>
<body bgcolor="#C0C0C0">
<center><h1><font color="Blue">Sherlock and Symtab Library</font></h1></center>
<font color="#102020">
<p>Sherlock, and the Symtab library upon which it is based, are intended
as diagnostic tools to aid software development on RISC OS, particularly
for debugging awkward failures on released software/OS builds.</p>

<p>The linker included in the Norcroft toolchain has always had the capacity
to emit a list of symbol addresses/offsets alongside the generated binary
code and this can be invaluable when mapping the raw address of a failure
into a location within the source/data of a program. Performing that mapping
by hand has always been a laborious process, however, which the Symtab
library aims to simplify greatly.</p>

<p>The library concerns itself with the loading, parsing and searching of
a number of symbol tables, potentially one for each and every loaded
RISC OS relocatable module, including the ROM modules, and one for the
application code itself. This set of symbol tables may then
be used to look up the address of a failing instruction/memory access
and map that address into a symbol triplet of 'module name', 'symbol name'
and 'address offset.' Without additional debugging information built into
the failing binary, this is about as far as the mapping can really proceed,
but it is usually enough to locate the line/datum in question very easily.</p>
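<p>As an illustration of the lookup described above, here is a minimal sketch
of mapping an address to a triplet by binary search over a sorted symbol
array. All of the names used (<tt>demo_symbol</tt>, <tt>demo_table</tt>,
<tt>demo_triplet</tt>, <tt>demo_lookup</tt>) are hypothetical and chosen for
this example only; they are not the Symtab library's actual interface, and
the real library may well organise its tables differently.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the real Symtab API. */
typedef struct {
    uint32_t    addr;   /* absolute address of the symbol */
    const char *name;
} demo_symbol;

typedef struct {
    const char        *module;  /* table name, e.g. "test" */
    const demo_symbol *syms;    /* assumed sorted by ascending addr */
    size_t             count;
    uint32_t           end;     /* first address past the table's range */
} demo_table;

typedef struct {
    const char *module;
    const char *symbol;
    uint32_t    offset;
} demo_triplet;

/* Find the last symbol whose address is <= addr.
   Returns 1 and fills *out on success, 0 if addr lies outside the table. */
int demo_lookup(const demo_table *t, uint32_t addr, demo_triplet *out)
{
    size_t lo = 0, hi = t->count;

    if (t->count == 0 || addr < t->syms[0].addr || addr >= t->end)
        return 0;

    /* Invariant: syms[lo].addr <= addr, and every symbol from hi on is > addr */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (t->syms[mid].addr <= addr)
            lo = mid;
        else
            hi = mid;
    }

    out->module = t->module;
    out->symbol = t->syms[lo].name;
    out->offset = addr - t->syms[lo].addr;
    return 1;
}
```

<p>Given a table holding 'file_close' at 0x8134 and 'file_gets' at 0x81A0,
looking up 0x817C with this sketch yields 'file_close' with an offset of
0x48.</p>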

<p>Symtab is deliberately designed to be self-contained using only a bare
minimum of ISO C run-time library functions, with callbacks into the client
code for any required complex functionality, so that it may easily be
incorporated into very low-level code and used in circumstances when the
C runtime may not be (fully) usable.</p>

<p>Sherlock is a RISC OS relocatable module that employs the Symtab library
to provide a number of *commands which are rough parallels of those provided
by the standard Debugger module, with the addition of symbolic information:</p>

<b>
<ul>
  <li>*SymLoad &lt;file | dir&gt; [&lt;file | dir&gt; ...]</li>
  <li>*SMemory [B | H | D] &lt;addr1 | reg1&gt; [[+|-] &lt;addr2 | reg2&gt;]</li>
  <li>*SMemoryI ...&lt;addr1 | reg1&gt; [[+|-] &lt;addr2 | reg2&gt;]</li>
  <li>*SMemoryS &lt;mode | addr1 | reg1&gt; [[+|-] &lt;addr2 | reg2&gt;]</li>
  <li>*SymTables [&lt;table&gt; [&lt;table&gt;...]]</li>
</ul>
</b>

<h2>Example usage of the above commands:</h2>

<p>Load a symbol file called 'tsym' into memory, naming the table 'test'
(in this case an application).</p>

<b>*SymLoad test##tsym</b>
<pre>Loaded symbol tables for test from 'tsym'</pre>
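<p>The 'name##file' form used above could be split along the following lines.
This is only a sketch: the function name <tt>demo_split_spec</tt> is
hypothetical, and the handling of an argument without a '##' prefix (treating
the whole string as the filename and leaving the name empty) is an assumption
made for the example rather than documented Sherlock behaviour.</p>

```c
#include <stddef.h>
#include <string.h>

/* Split "name##file" into a table name and a filename.
   Returns 1 if a "##" separator was found, 0 otherwise. In the latter
   case the name is left empty and *file points at the whole argument
   (an assumption for this sketch). Over-long names are truncated. */
int demo_split_spec(const char *spec, char *name, size_t name_size,
                    const char **file)
{
    const char *sep = strstr(spec, "##");
    size_t len;

    if (sep == NULL || name_size == 0) {
        if (name_size > 0)
            name[0] = '\0';
        *file = spec;
        return 0;
    }

    len = (size_t)(sep - spec);
    if (len >= name_size)
        len = name_size - 1;
    memcpy(name, spec, len);
    name[len] = '\0';

    *file = sep + 2;    /* skip the "##" itself */
    return 1;
}
```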

<p>List information on the symbol table(s) that have been loaded. Information
on specific tables, rather than all loaded tables, may be requested by specifying
the table name(s) on the command-line. A '-V' parameter lists verbose information
including all of the symbols within the table.</p>

<b>*SymTables</b>
<pre>
Symbol table 'test' at c90d9dec
  3 blocks
 Block 0 (ReadOnly): 0x8080 to 0xCDF4
  476 symbols
 Block 1 (ReadWrite): 0xCDF4 to 0xCE70
  6 symbols
 Block 2 (ZeroInit): 0xCE70 to 0xDF74
  77 symbols
</pre>

<p>To produce a symbolic disassembly of an address range, the *SMemoryI command
parallels the standard *MemoryI command of the Debugger module, supporting the
same syntax, but the first address may also be symbolic, as shown in the second
example below.</p>

<b>*smemoryi 817c + 40</b>
<pre>
0000817C : E3550000 : file_close+0x48                  : CMP     R5,#0
00008180 : 11A00005 : file_close+0x4C                  : MOVNE   R0,R5
00008184 : E91BA870 : file_close+0x50                  : LDMDB   R11,{R4-R6,R11,R13,PC}
00008188 : 0000CC00 : file_close+0x54                  : ANDEQ   R12,R0,R0,LSL #24
0000818C : 00000066 : file_close+0x58                  : ANDEQ   R0,R0,R6,RRX
00008190 : 656C6966 : file_close+0x5C                  : STRVSB  R6,[R12,#-2406]!
00008194 : 7465675F : file_close+0x60                  : STRVCBT R6,[R5],#-1887
00008198 : 00000073 : file_close+0x64                  : ANDEQ   R0,R0,R3,ROR R0
0000819C : FF00000C : file_close+0x68                  : Undefined instruction
000081A0 : E1A0C00D : file_gets                        : MOV     R12,R13
000081A4 : E92D000F : file_gets+0x4                    : STMDB   R13!,{R0-R3}
000081A8 : E92DDBF0 : file_gets+0x8                    : STMDB   R13!,{R4-R9,R11,R12,R14,PC}
000081AC : E24CB014 : file_gets+0xC                    : SUB     R11,R12,#
000081B0 : E15D000A : file_gets+0x10                   : CMP     R13,R10
000081B4 : 4B000FB5 : file_gets+0x14                   : BLMI    __rt_stkovf_split_small
000081B8 : E1B04001 : file_gets+0x18                   : MOVS    R4,R1
</pre>

<b>*smemoryi area_names + 40</b>
<pre>
0000CE20 : 00008454 : area_names                       : ANDEQ   R8,R0,R4,ASR R4
0000CE24 : 00008460 : area_names+0x4                   : ANDEQ   R8,R0,R0,ROR #8
0000CE28 : 0000846C : area_names+0x8                   : ANDEQ   R8,R0,R12,ROR #8
0000CE2C : 6E6E553C : area_names+0xC                   : MCRVS   CP5,3,R5,C14,C12,1
0000CE30 : 64656D61 : area_names+0x10                  : STRVSBT R6,[R5],#-3425
0000CE34 : 3028203E : area_names+0x14                  : EORCC   R2,R8,R14,LSR R0
0000CE38 : 58585878 : area_names+0x18                  : LDMPLDA R8,{R3-R6,R11,R12,R14}^
0000CE3C : 58585858 : area_names+0x1C                  : LDMPLDA R8,{R3,R4,R6,R11,R12,R14}^
0000CE40 : 00002958 : area_names+0x20                  : ANDEQ   R2,R0,R8,ASR R9    ; *** Not R8-R14
0000CE44 : 00000000 : area_names+0x24                  : ANDEQ   R0,R0,R0
0000CE48 : 00000000 : area_names+0x28                  : ANDEQ   R0,R0,R0
0000CE4C : 00000000 : test_handle                      : ANDEQ   R0,R0,R0
0000CE50 : 0000B2A0 : fn                               : ANDEQ   R11,R0,R0,LSR #5
0000CE54 : 0000B390 : fn+0x4                           : Undefined instruction
0000CE58 : 00008090 : fn+0x8                           : Undefined instruction
0000CE5C : 00008134 : fn+0xC                           : ANDEQ   R8,R0,R4,LSR R1
</pre>

<p>To load the symbols for a relocatable module, it is useful to specify the
name of the module as a prefix to the filename. For example, the following
command instructs the Sherlock module to load its own symbol table from the
file 'sym'. When the table is loaded, Sherlock will check for the presence of
a loaded Relocatable Module with the given name, and thus map the offsets
specified in the symbol file into absolute addresses. It will also do this if
the module is later loaded/reloaded, so the order in which the table and the
module itself are loaded does not matter.</p>

<b>*SymLoad Sherlock##sym</b>
<pre>
Setting base of 539800212 'ReadOnly' to 202CB294
Setting base of 539431508 'ReadWrite' to 20271254
Setting base of 539431712 'ZeroInit' to 20271320
Loaded symbol tables for Sherlock from 'sym'
</pre>
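<p>The rebasing step can be pictured roughly as follows: each block keeps the
offsets read from the symbol file, and absolute addresses are recomputed
whenever a base becomes known. The types and the function
<tt>demo_set_base</tt> are hypothetical names for this sketch and do not
describe how Sherlock actually stores its tables.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only. */
typedef struct {
    uint32_t offset;    /* as read from the symbol file */
    uint32_t addr;      /* absolute; valid only once the block is based */
} demo_entry;

typedef struct {
    const char *area;   /* e.g. "ReadOnly" */
    uint32_t    base;
    int         based;  /* nonzero once a base address has been set */
    demo_entry *entries;
    size_t      count;
} demo_block;

/* Recompute every absolute address from the stored offsets.
   Safe to call again if the module is later reloaded elsewhere,
   because the original offsets are never overwritten. */
void demo_set_base(demo_block *b, uint32_t base)
{
    size_t i;

    for (i = 0; i < b->count; i++)
        b->entries[i].addr = base + b->entries[i].offset;

    b->base  = base;
    b->based = 1;
}
```

<p>Keeping the offsets alongside the computed addresses is what makes the
load order irrelevant in this sketch: the table can sit unbased until the
module appears, and be rebased again if the module moves.</p>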

<p>The *SMemoryS command provides a crude backtrace/dump of the given
stack/address range. If a CPU mode is specified on the command-line,
rather than an address range as for the other *commands, the current
stack pointer for that mode is read and used as the start address.
Here is part of the output produced when *SMemoryS is called for the
SVC stack, and we can see that the Sherlock module is itself threaded
and its addresses appear on the Supervisor stack because it is
processing the *command. Clearly this is of limited utility at present,
and requires a SWI/lower-level interface to achieve its true potential.</p>


<b>*SMemoryS SVC</b>
<pre>
FA207F40 : 202745BD ->                                  : .E'
FA207F44 : FA208000 ->                                  : .. .
FA207F48 : FA207F40 ->                                  : @. .
FA207F4C : 202745BD ->                                  : .E'
FA207F50 : 00000001 ->                                  : ....
FA207F54 : 00000003 ->                                  : ....
FA207F58 : FB407C0C ->                                  : .|@.
FA207F5C : 23F60D4C ->                                  : L..#
FA207F60 : 23DBFF9C ->                                  : ...#
FA207F64 : 00000003 ->                                  : ....
FA207F68 : FFFFFFFF ->                                  : ....
FA207F6C : 00000000 ->                                  : ....
FA207F70 : 00000000 ->                                  : ....
FA207F74 : FA207F80 ->                                  : .. .
FA207F78 : 202CB478 -> Sherlock##__module_header+0x1E4  : x.,
FA207F7C : 202CE120 -> Sherlock##module_command+0xC     :  .,
FA207F80 : 202745BD ->                                  : .E'
FA207F84 : 00000053 ->                                  : S...
FA207F88 : FB407BF4 ->                                  : .{@.
FA207F8C : FC02389C ->                                  : .8..
FA207F90 : FFFFFFFF ->                                  : ....
FA207F94 : 202CB3F0 -> Sherlock##__module_header+0x15C  : ..,
FA207F98 : 202745B4 ->                                  : .E'
FA207F9C : 00000110 ->                                  : ....
...
</pre>
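<p>A crude stack scan of this kind amounts to walking the stack a word at a
time and asking whether each word resolves to a known address. The sketch
below shows the idea using a caller-supplied callback; the names
<tt>demo_resolve_fn</tt>, <tt>demo_range</tt>, <tt>demo_in_range</tt> and
<tt>demo_scan_stack</tt> are all hypothetical, and the simple range check
stands in for a real symbol table lookup.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Callback deciding whether a stack word looks like a known address.
   Hypothetical signature, not the real Symtab interface. */
typedef int (*demo_resolve_fn)(uint32_t word, void *ctx);

typedef struct {
    uint32_t lo;    /* first address of the range */
    uint32_t hi;    /* first address past the range */
} demo_range;

/* Example resolver: accepts any word inside [lo, hi). */
int demo_in_range(uint32_t word, void *ctx)
{
    const demo_range *r = (const demo_range *)ctx;
    return word >= r->lo && word < r->hi;
}

/* Scan 'count' words and record the index of each word that resolves.
   At most max_hits indices are stored; the return value is the total
   number of resolving words, which may exceed max_hits. */
size_t demo_scan_stack(const uint32_t *words, size_t count,
                       demo_resolve_fn resolve, void *ctx,
                       size_t *hits, size_t max_hits)
{
    size_t i, found = 0;

    for (i = 0; i < count; i++) {
        if (resolve(words[i], ctx)) {
            if (found < max_hits)
                hits[found] = i;
            found++;
        }
    }
    return found;
}
```

<p>Such a scan cannot tell a genuine return address from a data word that
merely happens to fall inside a code range, which is one reason to describe
the result as crude.</p>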

<b>*smemoryi file_close</b>
<pre>
202CB7D8 : E1A0C00D : file_close                       : MOV     R12,R13
202CB7DC : E92DD873 : file_close+0x4                   : STMDB   R13!,{R0,R1,R4-R6,R11,R12,R14,PC}
202CB7E0 : E24CB004 : file_close+0x8                   : SUB     R11,R12,#4
202CB7E4 : E15D000A : file_close+0xC                   : CMP     R13,R10
202CB7E8 : 4B00190F : file_close+0x10                  : BLMI    __rt_stkovf_split_small
202CB7EC : E1B06001 : file_close+0x14                  : MOVS    R6,R1
202CB7F0 : E1A04000 : file_close+0x18                  : MOV     R4,R0
202CB7F4 : 059F1030 : file_close+0x1C                  : LDREQ   R1,file_close+0x54
202CB7F8 : 024F2F11 : file_close+0x20                  : ADREQ   R2,file_open+0x88
202CB7FC : 028F0F0B : file_close+0x24                  : ADREQ   R0,file_close+0x58
202CB800 : 03A0303D : file_close+0x28                  : MOVEQ   R3,#
202CB804 : 0B001AB3 : file_close+0x2C                  : BLEQ    __assert2
202CB808 : E5960000 : file_close+0x30                  : LDR     R0,[R6,#0]
202CB80C : EB0016E4 : file_close+0x34                  : BL      xosfind_closew
202CB810 : E1A05000 : file_close+0x38                  : MOV     R5,R0
202CB814 : E1A01006 : file_close+0x3C                  : MOV     R1,R6
202CB818 : E1A00004 : file_close+0x40                  : MOV     R0,R4
202CB81C : EB00014A : file_close+0x44                  : BL      mem_free
202CB820 : E3550000 : file_close+0x48                  : CMP     R5,#0
202CB824 : 11A00005 : file_close+0x4C                  : MOVNE   R0,R5
202CB828 : E91BA870 : file_close+0x50                  : LDMDB   R11,{R4-R6,R11,R13,PC}
202CB82C : 202D279C : file_close+0x54                  : MLACS   R13,R12,R7,R2
202CB830 : 00000066 : file_close+0x58                  : ANDEQ   R0,R0,R6,RRX
202CB834 : 656C6966 : file_close+0x5C                  : STRVSB  R6,[R12,#-2406]!
</pre>


<h2>Using Sherlock now</h2>

<p>An in-progress development build - binary only for now, whilst I continue
working on the source and tidying a few loose ends - may be downloaded
<a href="../alpha/Sherlock.zip">here</a>.</p>

<p>For module code that fails but leaves the system sufficiently usable that
*commands may still be entered, it should be simple to use the Sherlock module
even in its current nascent state of development, since the code will
necessarily already be in memory.</p>

<p>Investigating a fault within application code currently requires the
symbol table and the application binary to be loaded into memory manually,
e.g. from a TaskWindow, issue *SymLoad &lt;symbol file&gt; followed by
*Load &lt;executable image&gt;. Bear in mind that the binary will not
be executed, and must thus be a raw (not compressed/encrypted) copy of
the in-memory executable at the point of failure. If you have a utility
that will produce a copy of the application memory at the point of failure,
or can grab the memory contents using your favourite source editor, then you
may choose to load that copy instead using a similar *Load command.</p>

<h2>Future Development</h2>

<p>Obvious next steps for the Sherlock module are to install exception
handlers which capture symbolic dumps/disassemblies/backtraces, and probably
to introduce SWI/direct interfaces to the routines which perform these
operations. It could also be beneficial to introduce calls from the ZeroPain
module into Sherlock, or the underlying Symtab library, so that non-faulted
accesses to zero page may be logged in a symbolic form whilst the application
continues running.</p>

<p>Please get in touch if you have any suggestions for further development of
the Sherlock module or the underlying Symtab library, to make it more useful
as a diagnostic/development tool. In due course it is my intention to release
all of the code as open source for the benefit of all developers, and so that
the library may readily be incorporated into other tools.</p>

<hr width="98%" />
<i>Copyright &copy; Adrian Lees 2015</i><br />
</font>
</body>
</html>
