Hive (part 2)
I’s continuation of previous post about Hive
. The first part focused on developers, environment and tools. This time I’m going to look deeper into code.
Erratum
In the first post about Hive
I wrote about accounts. Each commit contains information about author like: User #140 <Account.155@devlan.net>
. It looks
a bit strange, isn’t it? This unusual form niggled me. I couldn’t imagine that developers use only ID and talk about a code in this way: Did you see what
#140 made in foo.c? What a bummer! I made a deeper investigation and after few minutes all was clear. Data are malformed.
Someone from wikileaks
obfuscated data, small prove .git/logs/HEAD
:
b6ac4fd7cdeac91bb8425b9f9df49bda4411186b b6ac4fd7cdeac91bb8425b9f9df49bda4411186b fix <fix@wikileaks.org> 1508578973 -0400 checkout: moving from master to master
b6ac4fd7cdeac91bb8425b9f9df49bda4411186b b6ac4fd7cdeac91bb8425b9f9df49bda4411186b fix <fix@wikileaks.org> 1508579017 -0400 checkout: moving from master to b6ac4fd7cdeac91bb8425b9f9df49bda4411186b
b6ac4fd7cdeac91bb8425b9f9df49bda4411186b b6ac4fd7cdeac91bb8425b9f9df49bda4411186b fix <fix@wikileaks.org> 1508579044 -0400 checkout: moving from b6ac4fd7cdeac91bb8425b9f9df49bda4411186b to master
b6ac4fd7cdeac91bb8425b9f9df49bda4411186b 9a039f9fe67cefeed9042a8493b9e7f39f504c4c fix <fix@wikileaks.org> 1508579082 -0400 checkout: moving from master to armv5
9a039f9fe67cefeed9042a8493b9e7f39f504c4c b6ac4fd7cdeac91bb8425b9f9df49bda4411186b fix <fix@wikileaks.org> 1508579116 -0400 checkout: moving from armv5 to master
b6ac4fd7cdeac91bb8425b9f9df49bda4411186b 55de783e964f4cae02dc8530af8b24c3e5d93edd fix <fix@wikileaks.org> 1508579253 -0400 checkout: moving from master to mt6
55de783e964f4cae02dc8530af8b24c3e5d93edd 06e887de8037537af39ac5adf65c652536d6230f fix <fix@wikileaks.org> 1508579320 -0400 checkout: moving from mt6 to dhm
06e887de8037537af39ac5adf65c652536d6230f 1d93a9417e2bf263a73c019f7cd93c248e4f76b6 fix <fix@wikileaks.org> 1508579345 -0400 checkout: moving from dhm to debug
1d93a9417e2bf263a73c019f7cd93c248e4f76b6 1273b362f69f2ab0b4c4c7c25f3ccee65c0da0fd fix <fix@wikileaks.org> 1508579351 -0400 checkout: moving from debug to autotools
1273b362f69f2ab0b4c4c7c25f3ccee65c0da0fd 5c7b3f32dc88ed01d9c573bd4ef98158d614a66b fix <fix@wikileaks.org> 1508579380 -0400 checkout: moving from autotools to makemods
5c7b3f32dc88ed01d9c573bd4ef98158d614a66b 2b2e540698c7ffa760a1eae0fa6ecaf20a17ed30 fix <fix@wikileaks.org> 1508579390 -0400 checkout: moving from makemods to solarisbug
2b2e540698c7ffa760a1eae0fa6ecaf20a17ed30 46658be8101e799ef514516e2742c668a15ebe4b fix <fix@wikileaks.org> 1508579401 -0400 checkout: moving from solarisbug to ubiquiti
46658be8101e799ef514516e2742c668a15ebe4b 7450d4019017ba311c2b92167b21996dd447861c fix <fix@wikileaks.org> 1508579417 -0400 checkout: moving from ubiquiti to polar-1.3.4
7450d4019017ba311c2b92167b21996dd447861c b6ac4fd7cdeac91bb8425b9f9df49bda4411186b fix <fix@wikileaks.org> 1508846640 -0400 checkout: moving from polar-1.3.4 to master
It took more or less 3 days.
dirdival@pld:~$ python3
>>> import time
>>> time.gmtime(1508578973)
time.struct_time(tm_year=2017, tm_mon=10, tm_mday=21, tm_hour=9, tm_min=42, tm_sec=53, tm_wday=5, tm_yday=294, tm_isdst=0)
>>> time.gmtime(1508846640)
time.struct_time(tm_year=2017, tm_mon=10, tm_mday=24, tm_hour=12, tm_min=4, tm_sec=0, tm_wday=1, tm_yday=297, tm_isdst=0)
BTW, do you recognize the time zone? If not follow previous post. Did he make obfuscation well? Fortunately, he doesn’t know git like I do. Even if you remove or change something all data are still stored in git objects. I checked them all and I found hidden things.
Let’s now look under the hood:
dirdival@pld:/tmp/hive$ git cat-file -p 54bb023b7b95908455dddbc1b4ca984cd484e095
tree 8392a8140d924018067ef4ece3c2260c7427784d
parent 0265ca26c2c7f1765ab7c87b60a06a8a5587d2bd
author User #140 <Account.155@devlan.net> 1377009112 -0400
committer User #140 <Account.155@devlan.net> 1377009112 -0400
Added files and directories related to Eclipse IDE.
nailed:
dirdival@pld:/tmp/hive$ git cat-file -p a4bddadd3678dcdd96c075a96fa725f5b92c685a
tree 8392a8140d924018067ef4ece3c2260c7427784d
parent 6683eecc36487d8a23834a623a1f74daec899695
author Jack M <jack@neutrino.edb.devlan.net> 1377009112 -0400
committer Jack M <jack@neutrino.edb.devlan.net> 1377009112 -0400
Added files and directories related to Eclipse IDE.
It wasn’t too hard to collect all data and uncover real emails. Mapping:
User #140 <Account.155@devlan.net> - Jack M <jack@neutrino.edb.devlan.net>
User #142 <Account.156@devlan.net> - Jack M <jackmc@devlan.net>
User #217 <Account.227@devlan.net> - miker <user@hive-builder.edb.devlan.net>
User #226 <Account.233@devlan.net> - miker <miker@localhost.localdomain>
User #226 <Account.234@devlan.net> - miker <miker@stash.devlan.net>
User #226 <Account.235@devlan.net> - miker <miker@devlan.net>
How you can see on this repo work two developers, but they use different
machines to commit data. According to emails both of them are team members of Embedded Devices Branch (EDB)
.
More information about this department you can find on wikileaks, also there is a chart
shows structure of CIA.
Extra speculations about developers and environment
Miker
has got access to at list 2 machines:
- hive-builder.edb.devlan.net,
- stash.devlan.net.
The first one hive-builder
looks like a machine for making binary releases.
Second one stash
is a repo server. I have seen many commits with
description:
Merge branch 'idkey' of ssh://stash.devlan.net:7999/hive/hive into idkey
According to this speculation we can assume that miker
has higher rank in
EDB
than jackmc
.
Moreover, I compared number of commit made by developers:
dirdival@pld:/tmp/hive$ cat out.txt | grep miker | wc -l
37
dirdival@pld:/tmp/hive$ # 37
dirdival@pld:/tmp/hive$ cat out.txt | grep jackmc | wc -l
308
Jackmc
made much more commits but most of them show that he is a sloppy developer, small probe of changelog:
Author: User #142 <jackmc@devlan.net>
Date: Tue Oct 7 11:02:27 2014 -0400
Numerous fixes. Still debugging...
commit c8fb7630c77c7e8699268b1f7fa945bb229461de
Author: User #140 <jack@neutrino.edb.devlan.net>
Date: Mon Oct 6 15:57:19 2014 -0400
ILM-client working, but with xterm. To change see Command.cpp:320
commit 4f0e918cd8c89aab12939bc01f850226a42d40a6
Author: User #142 <jackmc@devlan.net>
Date: Mon Oct 6 15:01:48 2014 -0400
Fix oops.
commit 0a37703c10d4381c8bb18bb4d7deaf2796b0d674
Author: User #142 <jackmc@devlan.net>
Date: Mon Oct 6 14:22:12 2014 -0400
Tweaks
commit 5f53fb84421a2c822b1cd98f475eb1e9a6e2d038
Author: User #140 <jack@neutrino.edb.devlan.net>
Date: Mon Oct 6 09:31:18 2014 -0400
Fix ILM-client debugging.
We have rapid development here and this descriptions are present in main branch.
Interesting is his machine: neutrino.edb.devlan.net
. I haven’t found more information
about it.
Code
First, sad information. History of repo starts from version 2.6.1
. According to initial comit repo
was moved from subversion
into git
. It was done in the simplest way, without care about past data.
More details about releases I found in Hive 2.9 User's Guide
(git object 1315c15d006e6783b208643602d85f423086e013) – created on May 21, 2015 – informantion
about first release: 10/26/2010, Initial Release v1.0, Authority TDR
. 02/14/2011
released v1.0.2
and according to notes it should work on all supported architectures/systems:
- Windows,
- Linux/MikroTik MIPS,
- Linux/MikroTik PPC,
- Solaris 9-10 x86 & Linux x86.
Repository hierarchy
In common
directory we have bzip2
, polarssl
and ssl
– OpenSouce
libraries.
Client
is stored in client
directory
and server
I suppose that you can guess – yes, inside server
. Also, we have
infrastructure
, documentation
with lyrics. Moreover, honeycomb
is a server written in python
, definitely
author is not a native python developer. In user guide we can find this note: Honeycomb acts much like a
traditional iterative server that handles incoming beacon connections one-by-one. I didn’t focus on this code.
Finally, we have snapshot_*
directories. In my opinion storing binary releases in git
repo is bad idea.
It’s worth mentioning that in client/ctHive/ILMSDK
we have beautiful code in C++
, mostly XML
parser. Obviously
it wasn’t made by this team.
The last but not the least, there is not tests here. If you are a programmer you should know what does it mean – troubles.
client and server review
Hive
developers disappointed me. I expected something interesting
but we have in most places mess. Small example, master
branch client/misc.c
:
void DisplayStatus(struct proc_vars* info) {
int argc = 0;
char* message;
// fprintf(stdout, "\n****************************************************************\n\n");
//fprintf(stdout, "\n %sSession configuration parameters:%s\n", BLUE, RESET);
fprintf(stdout, "\n %s%s:%s\n", BLUE, sessionConfigParamString, RESET);
/*
if (info->listen == NO) {
fprintf(stdout, " TCP socket type = connect (active)\n");
} else {
fprintf(stdout, " TCP socket type = listen (passive)\n");
}
*/
if (info->interactive == YES) {
//fprintf(stdout, " . Interactive mode established\n");
// fprintf(stdout, "%s", interactiveModeString);
} else {
if (info->ignore == NO) {
//fprintf(stdout, " Automatic mode established (not ignoring errors)\n");
fprintf(stdout, "%s", automaticMode1String);
} else {
//fprintf(stdout, " Automatic mode established (ignoring errors)\n");
fprintf(stdout, "%s", automaticMode2String);
}
fprintf(stdout, "%s", message);
free(message);
Variable message
is not initialized and we print something and we (sic!) release it. Madness.
Let’s assume that is was accident. Maybe server looks better? Part of server/client_session.c
:
//******************************************************************
/*!
* Download a file from the local system to the command post
* @param path - complete path and filename
* @param size - size of file
* @param sock - socket
* @return
*/
int DownloadFile(char *path, unsigned long size, int sock)
{
REPLY ret; // Reply struct
unsigned char data[DATA_BUFFER_SIZE];
struct stat buf;
FILE *fd;
int bytes_read, bytes_written;
//TODO: Review and fix/remove this.
// to silence compiler warnings. this var no longer needed because of the
// ssl_context declared global to this file
sock = sock;
Just take a look on comment TODO
near sock
and this assignment:
sock = sock;
Could be even worse? Yes, It just a beginning. server/netstat_an.c
:
//Called by beacon.c to release data
void release_netstat_an(unsigned char* netstat_an)
{
if(netstat_an != NULL)
{
free(netstat_an);
netstat_an = NULL;
}
return;
}
They don’t know that assign NULL
it this case has no effect outside the function.
And this return
on the end – beautiful.
Yet another example of poor development – server/persistence.c
:
int EnablePersistence(char* beaconIP, int beaconPort)
{
//TODO: just to silence the compiler warning
beaconIP++;
beaconPort++;
return 0;
}
I’m curious why people store that code server/decode_dns.h
:
#if 0
// Broken code that isn't used anyway
typedef struct {
char name[NS_MAXDNAME]; // <--- this doesn't work here
DNS_rr_data rrmetadata;
const u_char *rdata;
} DNS_rr;
#endif
Does author think that anybody uncomment it?
server/survey_mac.c
:
// lower three byte are psuedo-random
mac[3] = (unsigned char) (htonl(rand()) >> 24);
mac[4] = (unsigned char) (htonl(rand()) >> 24);
mac[5] = (unsigned char) (htonl(rand()) >> 24);
Definitely, call htonl()
on random()
makes result more random.
And my favorite one from server/daemonize.c
:
int daemonize( void )
{
int i;
pid_t pid;
char devnull[10];
#ifdef SOLARIS
// set process max core file size to zero.
struct rlimit corelimit = { 0, 0 };
#endif
devnull[0] = '/'; devnull[5] = 'n';
devnull[1] = 'd'; devnull[6] = 'u';
devnull[2] = 'e'; devnull[7] = 'l';
devnull[3] = 'v'; devnull[8] = 'l';
devnull[4] = '/'; devnull[9] = '\0';
I h a v e n o i d e a w h a t I ' m d o i n g h e r e
. Clearly evidence
that EDB
doesn’t have any code review.
Conclusion
We can laugh at poor code, but the truth is that this software works. In one year
they made product able to interact with most used systems/architectures.
According to user guide the most problems they have with Solaris
and Linux
on MIPS
, quote:
As of Hive version 2.9, Solaris and MIPS little-endian architectures are no longer supported.
Moreover dump of repo comes from 2015 and we can assume that new versions were released.
I suppose that both mentioned developers right now are only maintainers of this code. They fix bugs, don’t
add new features. Also, they should keep eye on new releases of supported systems/architectures.
One thing just scared me. I expected much more from wikileaks
, they didn’t make properly data anonymization.
It is a real problem for whistleblowers.