D vs. C#

Mon Oct 22 15:49:33 PDT 2007

On Tue, 23 Oct 2007 00:54:40 +0300, Walter Bright <newshound1 at digitalmars.com> wrote:

> Vladimir Panteleev wrote:
>> On Mon, 22 Oct 2007 05:19:39 +0300, Walter Bright
>> <newshound1 at digitalmars.com> wrote:
>>
>>> I've never been able to discover what the fundamental advantage of
>>> a VM is.
>>
>> Some of the things which are only possible, or a good deal easier to
>> use/implement with VMs:
>>
>> 1) code generation - used very seldomly, it might be used for
>> runtime-specified cases where top performance is required (e.g.
>> genetic programming?)
>
> Are you referring to a JIT? JITs aren't easier to implement than a
> compiler back end.

I'm referring to using the standard library to emit code. This allows generating arbitrary code at runtime, without having to bundle a compiler or compiler components with your program. Integration with existing code is also available, so you could create an on-the-fly class that is derived from a "hard-coded" class in the application. The use case I mentioned is genetic programming - a technique where genetic evolution algorithms are applied to bytecode programs, and in this case it is desirable for the generated programs to run at maximum speed without compromising the host's stability.
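As a rough illustration of the idea (in Python rather than .NET or Java, purely because it is short), here is a sketch that emits a class at runtime and derives it from a class that is hard-coded in the host program. The names Organism, emit_subclass and Candidate are made up for this example, and Python's exec gives no isolation at all, so this shows only the code-generation half, not the "without compromising the host's stability" half.

```python
import textwrap


class Organism:
    """Hard-coded base class that lives in the host application."""

    def fitness(self, x):
        raise NotImplementedError


def emit_subclass(name, expression):
    """Generate source for a subclass of Organism, compile it, return the class.

    `expression` is a single-line Python expression in terms of `x`.
    (Hypothetical helper; no validation or sandboxing is attempted.)
    """
    source = textwrap.dedent(f"""
        class {name}(Organism):
            def fitness(self, x):
                return {expression}
    """)
    # The generated code sees only what we put into its namespace.
    namespace = {"Organism": Organism}
    code = compile(source, f"<generated {name}>", "exec")
    exec(code, namespace)
    return namespace[name]


# A "program" that a genetic algorithm might have produced.
Candidate = emit_subclass("Candidate", "x * x + 1")
candidate = Candidate()
print(isinstance(candidate, Organism), candidate.fitness(3))
```

The generated class is an ordinary subclass, so the rest of the application can use it through the Organism interface without knowing it was created at runtime.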

>> 2) VMs make modularity much easier in that you don't have to
>> recompile all modules ("plugins") on all platforms, which is often
>> not possible with projects whose core supports many platforms, but
>> most developers don't have access to all supported platforms.
>
> Problem is solved by defining your ".class" file to be compressed source
> code. Dealing with back end bugs on platform X is no different from
> dealing with VM bugs on platform X. Java is infamous for "compile once,
> debug everywhere".

Yes, though I didn't mention debugging. Otherwise, see below.

>> 3) very flexible reflection - like being able to derive from classes
>> in other modules. Though this can be done in native languages by
>> including enough metadata, most compiled languages don't.
>
> I think this is possible with compiled languages, but nobody has done it
> yet.

I believe DDL was going in that direction.

>> 4) compilation is not a simple process for most computer users out
>> there.
>
> Since the VM includes a JIT (a compiler) and runs it transparently to
> the user, there's no reason that compiler couldn't compile source code
> into native code transparently to the user.

Indeed. In fact, most of the issues I mentioned can be solved by distributing source code instead of intermediate bytecode. Actually, if you compare the Java/.NET VM with a hypothetical system which compiles the source code and runs the binary on the fly, the difference is pretty small - it's just that bytecode is one level lower than source code (and lexing/parsing the source would slow down compilation to native code to some degree).

I don't think it would be hard to turn D into a VM just like .NET - just split the front-end from the back-end, make the front-end serialize the AST, and distribute a back-end that reads ASTs, "JITs" them, links to Phobos/other libraries and runs them. You could even scan the AST for unsafe code (pointers, some types of casts), combine that with forced bounds checking, and you have a "safe" D VM/compiler. So, I'd like to ask - what exactly are we debating again? :)
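To make the "scan the AST, then hand it to the back-end" pipeline a bit more concrete, here is a small sketch using Python's own ast module as a stand-in for a D front-end. The deny-list (UNSAFE_NODES, UNSAFE_NAMES) and the helper names are assumptions invented for this example, and a deny-list like this is easy to bypass in Python, so treat it as the shape of the idea rather than a real sandbox.

```python
import ast

# Hypothetical policy: constructs this sketch treats as "unsafe".
UNSAFE_NODES = (ast.Import, ast.ImportFrom, ast.Global, ast.Nonlocal)
UNSAFE_NAMES = {"eval", "exec", "open", "__import__"}


def find_unsafe(tree):
    """Walk the AST and collect (line, description) for each rejected construct."""
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, UNSAFE_NODES):
            problems.append((node.lineno, type(node).__name__))
        elif isinstance(node, ast.Name) and node.id in UNSAFE_NAMES:
            problems.append((node.lineno, node.id))
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            problems.append((node.lineno, node.attr))
    return problems


def load_and_run(source, entry_point, *args):
    """Parse, verify, compile and run `entry_point` from the given source."""
    tree = ast.parse(source)  # "front-end": source -> AST
    problems = find_unsafe(tree)
    if problems:
        raise ValueError(f"unsafe constructs: {problems}")
    code = compile(tree, "<distributed module>", "exec")  # "back-end"
    namespace = {}
    exec(code, namespace)
    return namespace[entry_point](*args)


print(load_and_run("def add(a, b):\n    return a + b\n", "add", 2, 3))
```

The important property is that the check happens on the tree, before any code is generated, which is only possible because the program arrives at the AST level rather than as native code.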

Compared with VMs (systems that compile to bytecode), just distributing the source code (potentially wrapping it in a bundle or framework that can automatically compile and run the source for the user) inherits all the disadvantages of the VM (slow on first start, as the source code has to be compiled; the source or some other high-level structures can be extracted; etc.). The only obvious advantage is that the source is readily available in case it's necessary to debug the application, but Java already has the option to include the source in the .jar file (although the archive then carries the code as both bytecode and source).
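The "slow on first start" cost can be sketched as follows: a hypothetical launcher compiles the bundled source the first time it sees it and reuses the compiled form afterwards. The on-disk cache, the load_module helper and the cache layout are assumptions added for this illustration; nothing above prescribes them. Python's marshal format is also specific to the interpreter version, so a real launcher would have to key the cache on that as well.

```python
import hashlib
import marshal
import os
import tempfile


def load_module(source, cache_dir):
    """Return (namespace, compiled_now) for the given source text.

    The compiled code object is cached on disk, keyed by a hash of the
    source, so only the first start pays for compilation.
    """
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".bin")
    if os.path.exists(path):
        with open(path, "rb") as f:
            code = marshal.load(f)  # reuse the earlier compilation
        compiled_now = False
    else:
        code = compile(source, "<bundled source>", "exec")
        with open(path, "wb") as f:
            marshal.dump(code, f)
        compiled_now = True
    namespace = {}
    exec(code, namespace)
    return namespace, compiled_now


SOURCE = "def greet(name):\n    return 'hello ' + name\n"

with tempfile.TemporaryDirectory() as cache_dir:
    first_ns, first_compiled = load_module(SOURCE, cache_dir)
    second_ns, second_compiled = load_module(SOURCE, cache_dir)

print(first_compiled, second_compiled, second_ns["greet"]("world"))
```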

If we assume that all bytecode or source is compiled before it's run (nothing is interpreted), as should happen in a "perfect" VM, the term "VM" loses much of its original meaning. The only things left are the restrictions imposed on the language (no unsafe constructs like pointers) and the means to operate on the AST (reflection, code generation, etc.). Taking that into consideration, comparing a perfect "VM" with distributing native code seems to make slow start-up and the bulky VM runtime the only disadvantages of using VMs. (Have I abstracted so much that I'm forgetting something important here?)

>> 5) it's much easier to provide security/isolation for VM languages.
>> Although native code isolation can be done using hardware, it's
>> complicated and inefficient.
>
> The virtualization hardware works very well! It's complex, but it is far
> more efficient than a VM is. In fact, you're likely to be running on a
> hardware virtualized machine anyway!

Unfortunately, virtualization extensions are not available on all platforms - and implementing sandboxing on platforms where it's not supported by hardware would be quite complicated (involving disassembly, recompilation or interpretation). VirtualBox is a nice, partly open-source virtualization product, and its developers have stated that the software virtualization they implemented is faster than today's hardware virtualization.

>> This allows VM languages to be safely
>> embedded in places such as web pages (Flash for ActionScript, applets
>> for Java, Silverlight for .NET).
>
> It is not necessary to have a VM to achieve this. If you design a
> language that does not have arbitrary pointers, and you control the code
> generation, you can sandbox it in software every bit as effectively.
> This is why, for example, the Java JITs don't compromise their security
> model.

This requires that the code is supplied at a level high enough for this to be enforceable - that is, at either the source or the bytecode/AST level.

I also thought of another point (though it only stands against distributing native code binaries, not self-compiling source code):
6) Bytecode can be compiled to optimized code for the specific environment it is run on (processor vendor and family). It's not a big plus, just a "nice" advantage.

-- 
Best regards,
 Vladimir                          mailto:thecybershadow at gmail.com