h a l f b a k e r y
May contain nuts.
The 7-zip executable and DLLs on my Windows machine are
two megabytes in total, and the 7-zip algorithms provide
some of the best compression and decompression out
there. Not everyone has 7-zip installed, but that's not a big
deal; in most compression applications, carrying this 2MB of
code would be a negligible cost, so someone two or
three decades ago came up with the brilliant idea of a self-
extracting archive. With that technology, you can provide
someone an archive with arbitrarily good algorithms
without requiring the end user to install a particular
decompression utility. The critical problem with a self-
extracting archive is that opening an executable file from
an unknown source carries the risk of malware infection.
Taking a lesson from web browser scripting, I propose a
universal compression container wherein an arbitrary
executable is given access ONLY to a target folder,
forbidding access to memory, devices, and files outside
the necessary scope. Someone more clever than me might
even be able to write a wrapper for an existing self-
extracting archive to provide this containerization. (You
COULD already do this with a type 2 hypervisor if you'd
like, such as VirtualBox, but that's not convenient.) The
key advantage is that an AI engine would be able to use a
genetic algorithm to produce an optimally reduced
and then provide within the archive an arbitrary
executable for decompression, not relying on any
format. No further improvements in compression
technology would render this new format obsolete, since
such improvements would simply be represented within
this format. (How long has the EXE file been around?)
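One small piece of "access ONLY to a target folder" can be sketched in Python: a host-side check that every path the bundled decompressor asks to write resolves inside the target folder. The function name resolve_inside is hypothetical, and this is only the path check, not a sandbox; memory and device isolation would need OS or hypervisor support that is not shown here.

```python
import os

def resolve_inside(target_dir: str, member_path: str) -> str:
    """Return the absolute output path, or raise if it escapes target_dir.

    Hypothetical helper: a host process would call this before honouring
    any write requested by the untrusted decompressor.
    """
    root = os.path.realpath(target_dir)
    # realpath collapses ".." components and follows symlinks, so tricks
    # like "../../etc/passwd" are resolved before the comparison below.
    candidate = os.path.realpath(os.path.join(root, member_path))
    if os.path.commonpath([root, candidate]) != root:
        raise PermissionError(f"{member_path!r} escapes the target folder")
    return candidate
```

A request such as resolve_inside("out", "docs/readme.txt") is allowed, while "../evil" or an absolute path outside the folder raises PermissionError.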
||The contained decompression program can fail to terminate, or it might
"blow up" and attempt to allocate infinite memory. In a Turing-
complete language it is impossible in general to prove whether or not an
arbitrary program will do this without actually running it and seeing what happens.
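||Since termination can't be decided in advance, a container would presumably just impose a deadline and kill anything that overruns it. A minimal sketch in Python, assuming the decompressor runs as a child process; run_with_deadline is a hypothetical name, and a memory cap would need an OS-specific mechanism that is not shown:

```python
import subprocess
import sys

def run_with_deadline(argv, seconds):
    """Run argv as a child process.

    Returns True if the child exited on its own within `seconds`
    (whatever its exit code), False if it had to be killed.
    """
    try:
        # subprocess.run kills the child itself when the timeout expires.
        subprocess.run(argv, timeout=seconds, check=False)
        return True
    except subprocess.TimeoutExpired:
        return False

if __name__ == "__main__":
    # An endless loop stands in for a decompressor that never terminates.
    print(run_with_deadline([sys.executable, "-c", "while True: pass"], 0.5))
```

This doesn't prove anything about the program; it only bounds how much time a bad one can waste.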
||You may be able to get rid of this possibility by defining a non-
Turing-complete bytecode language, but I expect this will limit the
decompression algorithms that it can implement.
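||To make the non-Turing-complete option concrete, here is a sketch in Python of a toy bytecode with only two instructions, "lit" and "copy", in the style of LZ77 back-references. The instruction set, the name decode and the max_output parameter are all made up for illustration. There are no jumps or loops in the bytecode, so decoding makes one pass over a finite list and always terminates, and the output budget stops it from "blowing up".

```python
def decode(ops, max_output=1 << 20):
    """Decode a list of ops: ("lit", data) or ("copy", distance, length)."""
    out = bytearray()
    for op in ops:
        if op[0] == "lit":
            data = op[1]
            if len(out) + len(data) > max_output:
                raise ValueError("output budget exceeded")
            out += data
        elif op[0] == "copy":
            _, distance, length = op
            if not 0 < distance <= len(out):
                raise ValueError("copy reaches before the start of the output")
            if length < 0 or len(out) + length > max_output:
                raise ValueError("output budget exceeded")
            # Copy byte by byte so that overlapping copies repeat a pattern.
            for _ in range(length):
                out.append(out[-distance])
        else:
            raise ValueError(f"unknown instruction {op[0]!r}")
    return bytes(out)

if __name__ == "__main__":
    print(decode([("lit", b"ab"), ("copy", 2, 6)]))  # b'abababab'
```

The limitation is visible straight away: anything that isn't literals plus back-references, such as an arithmetic coder or a learned model, can't be expressed in this bytecode.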
||"Proof-carrying code" techniques might help, too, though I'm not
aware of them being used for anything outside academia.
||I don't understand the technicalities well enough, but I do applaud the notion of having a "quarantine folder" for suspect files, such that they can be contained whilst they're assessed.
||[wrongfellow] In applications like video, which might have
frames, wouldn't it be fairly easy to see if the potentially
"blown up" nonterminating data was still a video frame?
||This could bring continuously improving AI algorithms to
video on demand. I like that as just making file
compression twice as good, and computers 10x better could
fit most things people do into the "cheap data" rung of post
||[Max]: I think [kevinthenerd] was talking about quarantining the
decompression process rather than the newly downloaded files
themselves, but yes, this is a very valuable way to treat untrusted
files - in the extreme, you could use a physically separate computer,
but modern computers can simulate other computers (per Turing-
completeness) and it's silly not to look for security benefits from that.
||[beanie]: //the potentially "blown up" nonterminating data// cannot in
general be interpreted without running the program to completion
and allowing it to produce its output - which is sometimes
||So what you want is a program which is universally installed, for the case when you can't arrange for a standard program to be present?