In the previous part, we learned how to debug d8 with lldb. In this part we
still use our debug.js script, but we’ll add another parameter to d8 before
running the process.
lldb -- d8 --parse-only debug.js
With the --parse-only flag, we will focus on how d8 parses the source content
and turns it into lexical input elements.
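To make "lexical input elements" a bit more concrete, here is a toy sketch of what turning source text into tokens looks like. It is not V8's scanner: the `Tokenize` function is a hypothetical helper I made up for illustration, and it only understands identifiers, decimal numbers, and single-character punctuators.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy tokenizer, NOT V8's Scanner. It only knows identifiers,
// decimal numbers, and single-character punctuators.
std::vector<std::string> Tokenize(const std::string& source) {
  std::vector<std::string> tokens;
  size_t i = 0;
  while (i < source.size()) {
    unsigned char c = static_cast<unsigned char>(source[i]);
    if (std::isspace(c)) {  // White space and line breaks separate tokens.
      ++i;
      continue;
    }
    size_t start = i;
    if (std::isalpha(c) || c == '_' || c == '$') {
      // Identifier or keyword: letters, digits, '_' and '$'.
      while (i < source.size() &&
             (std::isalnum(static_cast<unsigned char>(source[i])) ||
              source[i] == '_' || source[i] == '$')) {
        ++i;
      }
    } else if (std::isdigit(c)) {
      // Decimal number: a run of digits.
      while (i < source.size() &&
             std::isdigit(static_cast<unsigned char>(source[i]))) {
        ++i;
      }
    } else {
      ++i;  // Anything else is treated as a one-character punctuator.
    }
    assert(i > start);  // Every round consumes at least one character.
    tokens.push_back(source.substr(start, i - start));
  }
  return tokens;
}
```

Calling `Tokenize("let a = 1\n")` splits the first line of a script like ours into the pieces `let`, `a`, `=` and `1`. The real scanner also classifies each token and handles strings, comments, and much more.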
Let’s add a breakpoint in the v8::Shell::RunMain function and jump to it.
(lldb) b v8::Shell::RunMain
(lldb) run
frame #0: 0x000000010004ba68 d8`v8::Shell::RunMain(isolate=0x0000000118110000, last_run=true) at d8.cc:4578:12
4575 }
4576
4577 int Shell::RunMain(Isolate* isolate, bool last_run) {
-> 4578 for (int i = 1; i < options.num_isolates; ++i) {
4579 options.isolate_sources[i].StartExecuteInThread();
4580 }
4581 bool success = true;
At line 4578, we loop through the additional isolate sources and execute them
in threads. But since our options.num_isolates = 1 and the loop starts at
i = 1, the loop body never runs, so we don’t need to worry about multiple
threads at this point.
Let’s jump into the Execute(isolate) function.
Our debug.js will be read into the Local<String> source variable. So let’s
take a look at v8::Shell::ReadFile.
(lldb) b v8::Shell::ReadFile
(lldb) c
If you feel overwhelmed by the functions above, don’t worry, I did too. Just
focus on the highlighted lines; I included the functions because they are so
interesting. A source file can be very small, like our debug.js, or very large.
I actually don’t know whether ECMAScript specifies a maximum source size. Either
way, the V8 scanner scans source code token by token, so loading the whole file
into memory is not efficient, and there would be tons of small read and seek
operations on the source file. Even though kernel code performs such operations
very quickly, we can still improve on this by using mmap. You can take a break
from this blog, go check out an explanation video of mmap, and then come back.
(*) Disclaimer: I only discovered mmap while writing this part, so my
explanation may be incorrect.
Hey, you’re back. So now you know what’s nice about mmap.
Hence, V8 only seeks to the end of the file to learn the file size and then
maps the source file to an address in V8’s virtual memory space. Run
thread step-out (or its alias finish) to jump back up to the Execute function.
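The seek-then-map idea described above can be sketched with plain POSIX calls. This is only my minimal illustration, not d8's actual ReadFile code: the `ReadWithMmap` name is hypothetical, and for simplicity it copies the mapped bytes into a `std::string` instead of using the mapping directly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper, NOT d8's ReadFile. Returns the file content, or an
// empty string if the file is empty or cannot be opened or mapped.
std::string ReadWithMmap(const char* path) {
  assert(path != nullptr);
  int fd = open(path, O_RDONLY);
  if (fd < 0) return std::string();

  // Seek to the end of the file to learn its size in bytes.
  off_t size = lseek(fd, 0, SEEK_END);
  if (size <= 0) {
    close(fd);
    return std::string();
  }
  size_t length = static_cast<size_t>(size);

  // Map the whole file read-only into our virtual address space.
  void* data = mmap(nullptr, length, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);  // The mapping stays valid after the descriptor is closed.
  if (data == MAP_FAILED) return std::string();

  // Copy the bytes out to keep the sketch simple, then drop the mapping.
  std::string content(static_cast<const char*>(data), length);
  munmap(data, length);
  return content;
}
```

Note that there is no read loop at all: once the file is mapped, its bytes are reachable through an ordinary pointer and the kernel pages them in on demand.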
Now we can check whether the source information is handled correctly.
(lldb) p arg
(const char *) $48 = 0x000000016fdff555 "debug.js"
(lldb) p source->Length()
(int) $49 = 55
File name is correct, and content length looks good.
Next, the v8::Shell::ExecuteString function call will execute our source script.
We set a breakpoint at that function and continue the process.
(lldb) b v8::Shell::ExecuteString
Breakpoint 1: where = d8`v8::Shell::ExecuteString(v8::Isolate*, v8::Local<v8::String>, v8::Local<v8::String>, v8::Shell::PrintResult, v8::Shell::ReportExceptions, v8::Shell::ProcessMessageQueue) + 140 at d8.cc:685:15, address = 0x000000010002b2bc
(lldb) c
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x000000010002b2bc d8`v8::Shell::ExecuteString(isolate=0x0000000118110000, source=(val_ = 0x0000000102020060), name=(val_ = 0x0000000102020058), print_result=kNoPrintResult, report_exceptions=kReportExceptions, process_message_queue=kProcessMessageQueue) at d8.cc:685:15
682 Local<String> name, PrintResult print_result,
683 ReportExceptions report_exceptions,
684 ProcessMessageQueue process_message_queue) {
-> 685 i::Isolate* i_isolate = reinterpret_cast<i::Isolate*>(isolate);
686 if (i::FLAG_parse_only) {
687 i::VMState<PARSER> state(i_isolate);
688 i::Handle<i::String> str = Utils::OpenHandle(*(source));
Target 0: (d8) stopped.
In this function, d8 opens the source and parses the script inside it.
Hopefully you are now more familiar with lldb step movements like step over,
step into, adding a breakpoint, and continuing to the next breakpoint. From now
on I will skip lldb commands like run, breakpoint set, continue, thread step-in,
and thread step-out.
Now we take a look at the v8::internal::parsing::ParseProgram function. Because
it is short, I put the whole function code here.
As you can see, we read the source code through a character stream, but at this
point our stream is still empty. Let’s step into the parser.ParseProgram call.
After a few internal checks, the scanner starts initializing. This is where our
code is translated into JavaScript lexical elements. The structure of the
scanner’s source stream is documented in the header file.
Fields describing the location of the current buffer physically in memory,
and semantically within the source string.

               0              buffer_pos_   pos()
               |                        |   |
               v________________________v___v_____________
               |                        |        |        |
Source string: |                        | Buffer |        |
               |________________________|________|________|
                                        ^   ^    ^
                                        |   |    |
                Pointers:   buffer_start_   |    buffer_end_
                                      buffer_cursor_
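To convince myself how these fields relate, I find it helpful to model them in a few lines. The struct below is a simplified, hypothetical model (the `BufferedStream` name and its `Advance` behaviour are my own, not V8's class): it reuses the field names from the diagram, and unlike the real stream it never refills the buffer, it just reports -1 when the buffer is exhausted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified model of the fields in the diagram.
// This is NOT V8's character stream class.
struct BufferedStream {
  const uint16_t* buffer_start_;   // First code unit held in the buffer.
  const uint16_t* buffer_cursor_;  // Next code unit to be read.
  const uint16_t* buffer_end_;     // One past the last buffered code unit.
  size_t buffer_pos_;              // Source position of buffer_start_.

  // Position of the cursor within the whole source string.
  size_t pos() const {
    return buffer_pos_ + static_cast<size_t>(buffer_cursor_ - buffer_start_);
  }

  // Returns the next code unit and moves the cursor forward, or -1 when
  // the buffer is exhausted (this sketch never refills the buffer).
  int32_t Advance() {
    assert(buffer_cursor_ <= buffer_end_);
    if (buffer_cursor_ == buffer_end_) return -1;
    return *buffer_cursor_++;
  }
};
```

So pos() is simply "where the buffer starts in the source" plus "how far the cursor has moved inside the buffer", which is the picture the diagram draws.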
Let’s check out what is in the current buffer.
(lldb) p scanner()->source_->buffer_start_
(const uint16_t *) $40 = 0x0000000102016032
(lldb) x/s 0x0000000102016032
0x102016032: "l"
We got the first character of our source file (l in the let keyword). Because
we know the source length is 55, we can print out the whole file.
(lldb) x/55s 0x0000000102016032
0x102016032: "l"
0x102016034: "e"
0x102016036: "t"
0x102016038: " "
0x10201603a: "a"
0x10201603c: " "
0x10201603e: "="
0x102016040: " "
0x102016042: "1"
0x102016044: "\n"
0x102016046: "v"
0x102016048: "a"
0x10201604a: "r"
0x10201604c: " "
0x10201604e: "b"
0x102016050: " "
0x102016052: "="
0x102016054: " "
0x102016056: "2"
0x102016058: "\n"
0x10201605a: "c"
0x10201605c: "o"
0x10201605e: "n"
0x102016060: "s"
0x102016062: "t"
0x102016064: " "
0x102016066: "c"
0x102016068: " "
0x10201606a: "="
0x10201606c: " "
0x10201606e: "3"
0x102016070: "\n"
0x102016072: "c"
0x102016074: "o"
0x102016076: "n"
0x102016078: "s"
0x10201607a: "o"
0x10201607c: "l"
0x10201607e: "e"
0x102016080: "."
0x102016082: "l"
0x102016084: "o"
0x102016086: "g"
0x102016088: "("
0x10201608a: "a"
0x10201608c: " "
0x10201608e: "+"
0x102016090: " "
0x102016092: "b"
0x102016094: " "
0x102016096: "+"
0x102016098: " "
0x10201609a: "c"
0x10201609c: ")"
0x10201609e: "\n"
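Notice that each character is printed as its own one-character string and the addresses grow by 2. The buffer holds uint16_t code units, so on this little-endian machine the letter l is stored as the bytes 0x6c 0x00, and x/s stops at that zero byte. As a small sketch, here is how such a two-byte buffer could be turned back into readable text. The `AsciiFromTwoByte` helper is hypothetical and assumes every code unit is plain ASCII, which holds for our debug.js.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: rebuilds readable text from two-byte code units,
// assuming every unit is plain ASCII (true for our debug.js).
std::string AsciiFromTwoByte(const uint16_t* units, size_t length) {
  assert(units != nullptr || length == 0);
  std::string text;
  text.reserve(length);
  for (size_t i = 0; i < length; ++i) {
    // Non-ASCII units are replaced by '?' to keep the sketch small.
    text.push_back(units[i] < 0x80 ? static_cast<char>(units[i]) : '?');
  }
  return text;
}
```

Feeding it the first five code units from the dump above (l, e, t, space, a) gives back the string "let a".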
That’s exactly our script from the very first part. Wow! I can’t believe we’ve come this far. But it’s just the beginning. I’ll see you next time.