vibe.d-lite v0.1.0 powered by photon
Sönke Ludwig
sludwig at outerproduct.org
Thu Sep 25 07:24:00 UTC 2025
On 23.09.25 at 17:35, Dmitry Olshansky wrote:
> On Monday, 22 September 2025 at 11:14:17 UTC, Sönke Ludwig wrote:
>> On 22.09.25 at 09:49, Dmitry Olshansky wrote:
>>> On Friday, 19 September 2025 at 17:37:36 UTC, Sönke Ludwig wrote:
>>>> So you don't support timeouts when waiting for an event at all?
>>>> Otherwise I don't see why a separate API would be required, this
>>>> should be implementable with plain Posix APIs within vibe-core-lite
>>>> itself.
>>>
>>> Photon's API is the syscall interface. So to wait on an event you
>>> just call poll.
>>> Behind the scenes it will just wait on the right fd to change state.
>>>
>>> Now vibe-core-light wants something like read(buffer, timeout), which
>>> is not syscall API but may be added. But since I'm going to add new
>>> API, I'd rather have something consistent and sane, not just a bunch
>>> of ad-hoc functions to satisfy the vibe.d interface.
>>
>> Why can't you then use poll() to implement, for example, `ManualEvent`
>> with timeout and interrupt support? And shouldn't recv() with timeout
>> be implementable the same way, i.e. poll with a timeout and only read
>> when ready?
>
> Yes, recv with timeout is basically poll+recv. The problem is that then
> I need to support interrupts in poll. Nothing really changed.
> As far as the manual event goes, I've implemented that with a custom
> cond var and mutex. That mutex is not interruptible, as it's backed by
> a semaphore on the slow path in the form of an eventfd.
> I might create a custom mutex that is interruptible, I guess, but the
> notion of interrupts would have to be introduced to photon. I do not
> really like it.
I'd probably create an additional event FD per thread that is used to
signal interruption, and also pass that to any poll() that is used for
an interruptible wait.
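Roughly like this, as a minimal sketch for Linux (waitReadable/interrupt
are hypothetical names, not photon or vibe-core API):

import core.sys.linux.sys.eventfd : eventfd, EFD_NONBLOCK;
import core.sys.posix.poll : poll, pollfd, POLLIN;
import core.sys.posix.unistd : read, write;

// One eventfd per thread, used solely to signal interruption.
// Module-level variables are thread-local by default in D, and
// static this() runs once per thread.
private int interruptFD;
static this() { interruptFD = eventfd(0, EFD_NONBLOCK); }

enum WaitResult { ready, timedOut, interrupted }

// Wait until fd becomes readable, the timeout expires, or this
// thread's interrupt event fires.
WaitResult waitReadable(int fd, int timeoutMsecs)
{
    pollfd[2] fds = [
        pollfd(fd, POLLIN, 0),
        pollfd(interruptFD, POLLIN, 0),
    ];
    const n = poll(fds.ptr, fds.length, timeoutMsecs);
    if (n == 0) return WaitResult.timedOut;
    if (fds[1].revents & POLLIN) {
        ulong tmp;
        read(interruptFD, &tmp, tmp.sizeof); // reset the event counter
        return WaitResult.interrupted;
    }
    return WaitResult.ready;
}

// Called from another thread, given the waiter's eventfd.
void interrupt(int waiterInterruptFD)
{
    ulong one = 1;
    write(waiterInterruptFD, &one, one.sizeof);
}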
>> I think we have a misunderstanding of what vibe.d is supposed to be.
>> It seems like you are only focused on the web/server role, while to me
>> vibe-core is a general-purpose I/O and concurrency system with no
>> particular specialization in server tasks. With that view, your
>> statement to me sounds like "Clearly D is not meant to do multi-
>> threading, since main() is only running in a single thread".
>
> The defaults are what is important. Go defaults to multi-threading, for
> instance.
> D defaults to multi-threading, because TLS by default is certainly a
> mark of a multi-threaded environment. std.concurrency defaults to a new
> thread per spawn; again, this tells me it's about multi-threading. I
> intend to support multi-threading by default. I understand that we view
> this issue differently.
But you are comparing different defaults here. With plain D, you also
have to import either `core.thread` or
`std.concurrency`/`std.parallelism` to do any multi-threaded work. The
same is true for vibe-core. What you propose would be more comparable to
having foreach() operate like parallelForeach(), with far-reaching
consequences.
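For illustration, this is what the existing opt-in looks like with
std.parallelism:

import std.parallelism : parallel;

void main()
{
    auto data = new int[](100);

    // Plain foreach is single-threaded by default.
    foreach (i, ref x; data)
        x = cast(int)i * 2;

    // Multi-threading only happens on explicit request.
    foreach (i, ref x; parallel(data))
        x = cast(int)i * 2;
}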
If we are just talking about naming - runTask/runWorkerTask vs.
go/goOnSameThread - that is of course debatable, but in that case I
think it's blown very much out of proportion to take that as the basis
to claim "it's meant to be used single-threaded".
>>>> Anything client side involving a user interface has plenty of
>>>> opportunities for employing secondary tasks or long-running sparsely
>>>> updated state logic that are not CPU bound. Most of the time is
>>>> spent idle there. Specific computations on the other hand can of
>>>> course still be handed off to other threads.
>>>
>>> Latency is still going to be better if multiple cores are utilized.
>>> And I'm still not sure what the example is.
>>
>> We are comparing fiber switches and working on data with a shared
>> cache and no synchronization to synchronizing data access and control
>> flow between threads/cores. There is such a broad spectrum of
>> possibilities for one of those to be faster than the other that it's
>> just silly to make a general statement like that.
>>
>> The thing is that if you always share data between threads, you have
>> to pay for that for every single data access, regardless of whether
>> there is actual concurrency going on or not.
>
> Obviously, we should strive to share responsibly. Photon has Channels
> much like vibe-core has Channel. Mine are MPSC though, mostly to model
> Input/Output range concepts.
True, but it's still not free (as in CPU cycles and code complexity) and
you can't always control all code involved.
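For reference, a cross-thread producer/consumer with vibe-core's
Channel looks roughly like this minimal sketch:

import vibe.core.channel : Channel, createChannel;
import vibe.core.core : runWorkerTask;

void produce(Channel!int ch)
{
    foreach (i; 0 .. 10)
        ch.put(i);  // synchronized hand-off to the consumer
    ch.close();     // makes tryConsumeOne below return false when drained
}

void main()
{
    auto ch = createChannel!int();
    runWorkerTask(&produce, ch); // producer runs on a worker thread

    int value;
    while (ch.tryConsumeOne(value))
    {
        // every receive pays the cross-thread synchronization cost,
        // whether or not there is actual concurrency to exploit
    }
}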
>> If you want a concrete example, take a simple download dialog with a
>> progress bar. There is no gain in off-loading anything to a separate
>> thread here, since this is fully I/O bound, but it adds quite some
>> communication complexity if you do. CPU performance is simply not a
>> concern here.
>
> Channels tame the complexity. Yes, channels could get more expensive in
> a multi-threaded scenario, but we already agreed that it's not CPU bound.
If you have code that does a lot of these things, this just degrades
code readability for absolutely no practical gain, though.
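To make that concrete, a minimal sketch of the same-thread variant
(openDownload/readChunk are hypothetical placeholders; only runTask is
actual vibe-core API):

import vibe.core.core : runTask;

// Plain thread-local state: the download task and the UI task run as
// fibers on the same thread, so no synchronization is needed.
size_t bytesDone, bytesTotal;

void startDownload(string url)
{
    runTask({
        auto conn = openDownload(url); // hypothetical helper
        bytesTotal = conn.length;      // hypothetical
        while (!conn.empty)
        {
            // hypothetical; blocking I/O implicitly yields the fiber,
            // letting the UI task repaint the progress bar meanwhile
            bytesDone += conn.readChunk();
        }
    });
}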
>>>> The problem is that for example you might have a handle that was
>>>> created in thread A and is not valid in thread B, or you set a state
>>>> in thread A and thread B doesn't see that state. This would mean
>>>> that you are limited to a single task for the complete library
>>>> interaction.
>>>
>>> Or just initialize it lazily in all threads that happen to use it.
>>> Otherwise, this is basically "stick to one thread", really.
>>
>> But then it's a different handle representing a different object -
>> that's not the same thing. I'm not just talking about initializing the
>> library as a whole. But even then, there are a lot of libraries that
>> don't use TLS and are simply not thread-safe at all.
>
> Something that is not thread-safe at all is a dying breed. We have had
> multi-core machines for 20 years now. Most libraries can be initialized
> once per thread, which is quite naturally modeled with a TLS handle to
> said library. Communicating between fibers via a shared TLS handle is
> not something I would recommend regardless of the default spawn behavior.
Unfortunately, those libraries are an unpleasant reality that you can't
always avoid.
BTW, one of the worst offenders is Apple's whole Objective-C API.
Auto-release pools in particular make it extremely fragile to work with
fibers at all and of course there are all kinds of hidden thread
dependencies inside.
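That said, the lazy per-thread initialization mentioned above is
trivial to express in D, since module-level variables are thread-local
by default (LibHandle/initLibrary stand in for a real library) - with
the caveat that each thread then holds a distinct handle:

struct LibHandle {} // stand-in for the library's handle type
LibHandle* initLibrary() { return new LibHandle; } // stand-in init call

private LibHandle* handle; // module-level = thread-local in D

LibHandle* getHandle()
{
    // Each thread initializes its own instance on first use - a
    // different handle per thread, not one shared object.
    if (handle is null)
        handle = initLibrary();
    return handle;
}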
>>>> This doesn't make sense; in the original vibe-core, you can simply
>>>> choose between spawning in the same thread or in "any" thread.
>>>> `shared`/`immutable` is correctly enforced in the latter case to
>>>> avoid unintended data sharing.
>>>
>>> I have go and goOnSameThread. Guess which is the encouraged option.
>>
>> Does go() enforce proper use of shared/immutable when passing data to
>> the scheduled "go routine"?
>
> It goes with the same API as we have for threads - a delegate - so
> sharing becomes the user's responsibility. I may add function + args
> for better handling of resources passed to the lambda.
That means that this is completely un`@safe` - C++ level memory safety.
IMO this is an unacceptable default for web applications.
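For contrast, a minimal sketch of the vibe-core side: runWorkerTask
takes a callable plus arguments and statically rejects arguments that
are neither shared, immutable, nor otherwise isolated:

import vibe.core.core : runWorkerTask;

void work(immutable(int)[] data)
{
    // runs on a worker thread; immutable data is safe to share
}

void main()
{
    immutable(int)[] numbers = [1, 2, 3];
    runWorkerTask(&work, numbers); // OK: immutable may cross threads

    // int[] mut = [1, 2, 3];
    // runWorkerTask(&work, mut);  // rejected at compile time: mutable,
    //                             // unshared data must not cross threads
}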
>>>> The GC/malloc is the main reason why this is mostly false in
>>>> practice, but it extends to any central contention source within the
>>>> process - yes, often you can avoid that, but often that takes a lot
>>>> of extra work and processes sidestep that issue in the first place.
>>>
>>> As can be observed from other languages and runtimes, malloc is not
>>> the bottleneck it used to be. Our particular version of the GC, which
>>> doesn't have thread caches, is a bottleneck.
>>
>> malloc() will also always be a bottleneck with the right load. Just
>> the n times larger amount of virtual address space required may start
>> to become an issue for memory-heavy applications. But even if we
>> ignore that, ruling out using the existing GC doesn't sound like a
>> good idea to me.
>
> The existing GC is basically 20+ years old; of course we need a better
> GC, and thread-cached allocation solves contention in multi-threaded
> environments.
> An alternative memory allocator is doing great on 320-core machines. I
> cannot tell you which allocator that is or what exactly these servers
> are. Though even jemalloc does okayish.
>
>> And the fact is that, even with relatively mild GC use, a web
>> application will not scale properly with many cores.
>
> I only partially agree: Java's GC handles load just fine and runs
> faster than vibe.d(-light). It does allocations on its serving code path.
I was just talking about the current D GC here. Once we have a better
implementation, this can very well become a much weaker argument!
However, speaking more generally, the other arguments for preferring to
scale using processes still stand, and even with a better GC I would
still argue that leading library users to do multi-threaded request
handling is not necessarily the best default (of course it still *can*
be for some applications).
Anyway, the main point from my side is just that the semantics of what
*is* in vibe-core-light should really match the corresponding functions
in vibe-core. Apart from that, I was just telling you that your
impression of it being intended to be used single-threaded is not right
- although the presentation should probably emphasize the multi-threaded
functionality and multi-threaded request processing more.
>>>> Separate processes also have the advantage of being more robust and
>>>> enabling seamless restarts and updates of the executable. And they
>>>> facilitate an application design that lends itself to scaling across
>>>> multiple machines.
>>>
>>> Then give me the example code to run multiple vibe.d instances in
>>> parallel processes (should be similar to runDist) and we can compare
>>> approaches. For all I know it could be faster than multi-threaded
>>> vibe.d-light. Also, honestly, if vibe.d's target is multiple
>>> processes, it should probably start like this by default.
>>
>> Again, the "default" is a high-level issue and none of vibe-core's
>> business. The simplest way to have that work is to use
>> `HTTPServerOption.reusePort` and then start as many processes as desired.
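For illustration, each process would run something along these lines,
and you start as many copies as desired:

import vibe.core.core : runApplication;
import vibe.http.server : HTTPServerOption, HTTPServerSettings, listenHTTP;

void main()
{
    auto settings = new HTTPServerSettings;
    settings.port = 8080;
    // SO_REUSEPORT: multiple independently started processes can bind
    // the same port; the kernel distributes incoming connections.
    settings.options |= HTTPServerOption.reusePort;

    listenHTTP(settings, (req, res) {
        res.writeBody("Hello, World!");
    });
    runApplication();
}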
>
> So I did just that. To my surprise it indeed speeds up all of my D
> server examples.
> The speed ups are roughly:
>
> On vibe-http-light:
> 8 cores 1.14
> 12 cores 1.10
> 16 cores 1.08
> 24 cores 1.05
> 32 cores 1.06
> 48 cores 1.07
>
> On vibe-http-classic:
> 8 cores 1.33
> 12 cores 1.45
> 16 cores 1.60
> 24 cores 2.54
> 32 cores 4.44
> 48 cores 8.56
>
> On plain photon-http:
> 8 cores 1.15
> 12 cores 1.10
> 16 cores 1.09
> 24 cores 1.05
> 32 cores 1.07
> 48 cores 1.04
>
> We should absolutely tweak the vibe.d TechEmpower benchmark to run
> vibe.d as a process per core! As far as the photon-powered versions go,
> I see there is a point where per-process becomes less of a gain with
> more cores, so I would think there are two factors at play, one
> positive and one negative, with the negative being tied to the number
> of processes.
>
> Lastly, I have found opportunities to speed up vibe-http even without
> switching to vibe-core-light. Will send PRs.
Interesting. I wonder whether it's the REUSE_PORT connection
distribution that gets more expensive when it's working cross-process.
Agreed that the TechEmpower benchmark is in dire need of being looked
at. In fact, I had the code checked out for a long while, intending to
look into it, because it obviously didn't scale like my own benchmarks,
but then I never got around to doing it, being too busy with other things.