VMware Workstation PVSCSI LFH Escape (VMware-vmx on Windows 11)

Tip

Learn & practice AWS Hacking: HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking: HackTricks Training GCP Red Team Expert (GRTE)
Learn & practice Azure Hacking: HackTricks Training Azure Red Team Expert (AzRTE)

Support HackTricks

Check the subscription plans!

Join the 💬 Discord group or the Telegram group, or follow us on Twitter 🐦 @hacktricks_live.

Share hacking tricks by submitting PRs to the HackTricks and HackTricks Cloud GitHub repos.

Bug anatomy: fixed-size realloc + scattered OOB writes

PVSCSI_FillSGI copies guest scatter/gather entries into an internal array. It starts with a 512-entry static buffer (0x2000 bytes). Above 512 entries it reallocates to 0x4000 bytes and, because of a functional bug, reallocates again on every iteration.
The reallocation size never grows: 0x4000 bytes / 0x10-byte entries = 1024 usable entries. When the guest sends more than 1024 entries, each additional entry is written out of bounds, 16 bytes at a time, past a freshly allocated 0x4000 chunk, corrupting the adjacent chunk header or object.
Overflow content: VMware stores {u64 addr; u64 len}; the guest supplies {u64 addr; u32 len; u32 flags}. The 32-bit len is zero-extended, so the last dword of every 16-byte OOB element is always 0x00000000.
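The size arithmetic and the forced zero dword can be checked with a minimal sketch. The constant names and the `host_entry` helper are illustrative, not taken from vmware-vmx; only the two struct layouts and the sizes come from the text above, and little-endian packing is assumed.

```python
import struct

ENTRY_SIZE = 0x10        # one host-side {u64 addr; u64 len} element
STATIC_ENTRIES = 512     # entries in the initial static buffer
REALLOC_BYTES = 0x4000   # size of every reallocation (never grows)


def host_entry(guest_entry: bytes) -> bytes:
    """Convert a guest {u64 addr; u32 len; u32 flags} element to {u64 addr; u64 len}."""
    addr, length, _flags = struct.unpack("<QII", guest_entry)
    # The 32-bit length is zero-extended to 64 bits; the flags are dropped.
    return struct.pack("<QQ", addr, length)


capacity = REALLOC_BYTES // ENTRY_SIZE
guest = struct.pack("<QII", 0x1122334455667788, 0xDEADBEEF, 0xFFFFFFFF)
host = host_entry(guest)

print(hex(STATIC_ENTRIES * ENTRY_SIZE))  # 0x2000
print(capacity)                          # 1024
print(host[-4:].hex())                   # 00000000
```

Whatever the guest puts in len and flags, the final four bytes of the stored element stay zero, which is the constraint the later sections work around.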

LFH constraints & deterministic “Ping-Pong” placement

0x4000 allocations land in the Windows 11 LFH (16 chunks/bucket, 0x10-byte metadata with a keyed checksum). Any chunk whose header checksum is later checked after corruption terminates the process, so corrupted headers must never be reused.
The LFH returns a random free chunk but prefers the bucket that holds the most recently freed chunk. Force exactly two free slots:

1. Allocate all free 0x4000 chunks to align the allocator; spray 32 SVGA shaders so that buckets B1 and B2 are full.
2. Free every shader in B1 except one pinned shader (Hole0) so that B1 stays active; allocate 15 URBs into B1.
3. Free one shader from B2 (PONG), then immediately free Hole0. The LFH now alternates allocations between the two available slots: PING (B1) and PONG (B2).

Iteration 1025 corrupts the header after PONG (which is never touched again); iteration 1026 hits the first 16 bytes of the URB after PING (a safe metadata bypass). Reclaim PING/PONG with placeholder shaders and repeat to keep the layout stable and repeatable.
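Why iterations 1025 and 1026 land on different targets follows from the entry offsets. A rough model, assuming iterations are counted from 1 and that the 0x10-byte LFH metadata of the next chunk sits directly after the 0x4000 bytes of the current one (both are assumptions of this sketch; `classify_write` is a hypothetical helper):

```python
CHUNK_BYTES = 0x4000   # size of the reallocated array
ENTRY_SIZE = 0x10      # one {u64 addr; u64 len} element
HEADER_BYTES = 0x10    # assumed: next chunk's LFH metadata follows immediately


def classify_write(iteration: int) -> str:
    """Describe where the entry written on a 1-based iteration lands."""
    offset = (iteration - 1) * ENTRY_SIZE
    if offset < CHUNK_BYTES:
        return "in bounds"
    past = offset - CHUNK_BYTES
    if past < HEADER_BYTES:
        return "next chunk header"
    return f"next chunk data +{past - HEADER_BYTES:#x}"


for it in (1024, 1025, 1026):
    print(it, classify_write(it))
```

Under this model iteration 1024 is the last in-bounds write, 1025 overlaps the neighbouring header, and 1026 overlaps the first 16 bytes of the neighbouring object. Because the array is reallocated on every iteration, consecutive writes hit the neighbours of different chunks, which is what the two-slot alternation controls.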

Reap Oracle: labeling contiguous holes

UHCI URBs live in a FIFO queue and are freed once they are fully reaped. The constrained 16-byte overwrite always zeroes actual_len, which provides a marker.
Reap URBs in order; when a zeroed actual_len shows up, immediately refill the freed slot with a recognizable shader. Iterating this maps Hole0–Hole3 as four contiguous chunks in a known order, which later adjacency-dependent primitives rely on.
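The oracle logic amounts to a scan over a FIFO. A toy model, assuming every untouched URB reports a nonzero actual_len (`ToyUrb` and `find_marked` are hypothetical names, not real UHCI structures):

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ToyUrb:
    tag: str          # label chosen for this sketch
    actual_len: int   # nonzero unless the overwrite hit this URB


def find_marked(fifo: deque) -> list:
    """Reap URBs in FIFO order and return the tags whose actual_len was zeroed."""
    marked = []
    while fifo:
        urb = fifo.popleft()      # reaping also frees the URB's slot
        if urb.actual_len == 0:   # marker left by the constrained overwrite
            marked.append(urb.tag)
    return marked


queue = deque(ToyUrb(f"urb{i}", 8) for i in range(15))
queue[6].actual_len = 0           # simulate the 16-byte overwrite landing here
print(find_marked(queue))         # ['urb6']
```

In the real flow the marked slot is refilled right away with a recognizable shader, so each pass labels one more chunk in the contiguous run.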

Turning constrained writes into arbitrary overwrites (coalescing abuse)

PVSCSI coalesces adjacent entries when AddrA + LenA == AddrB and compacts the later entries upward.

Two-pass overflow: trigger first starting from PING (odd indices) and exit early to skip coalescing; then trigger again starting from PONG (even indices) so that the gaps are filled and writing continues into fake S/G entries placed in a sprayed shader.
Vacuum + payload: set entries [1023..2047] to {addr=0,len=0} so that coalescing collapses them into one, creating a logical hole. The payload entries stored in the shader then move up and land inside the victim URB.
Adjacency-check bypass: with LenA=0 the condition reduces to AddrA == AddrB. Build pairs

{addr = X, len = 0}
{addr = X, len = Y}

so that coalescing merges them into {addr=X,len=Y}. The even-indexed zero-size elements come from the constrained overflow; the odd-indexed values live in the shader. Result: arbitrary 16-byte patterns despite the forced zero dword.
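The merge rule can be modelled in a few lines. This is a sketch of the stated condition only (AddrA + LenA == AddrB, later entries compacted upward); the greedy left-to-right pass and the `coalesce` name are assumptions, not the actual PVSCSI code.

```python
def coalesce(entries):
    """Merge neighbours where prev.addr + prev.len == cur.addr, compacting upward."""
    out = []
    for addr, length in entries:
        if out and out[-1][0] + out[-1][1] == addr:
            # Adjacent in guest memory: extend the previous entry.
            out[-1] = (out[-1][0], out[-1][1] + length)
        else:
            out.append((addr, length))
    return out


X, Y = 0x4141414141414141, 0x4242424242424242

# Adjacency-check bypass: a zero-length entry followed by the real value.
print([(hex(a), hex(l)) for a, l in coalesce([(X, 0), (X, Y)])])

# Vacuum: a run of {addr=0,len=0} entries collapses into a single entry.
print(len(coalesce([(0, 0)] * 1025)))  # 1
```

The first call yields a single {addr=X,len=Y} element whose len field is fully controlled, which is how the pairs sidestep the zero upper dword imposed by the 32-bit guest length.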

Hybrid URB infoleak via coalescing side-effects

Arrange contiguous chunks as: [Hole0 (free/PING), URB1 (target), URB2 (valid, actual_len=0), URB3 (leak target)].
Fill URB1 with contiguous fake entries (sizes 0xFFFFFFFF) and touch URB2 only minimally. Coalescing merges them into one; the sum 0xFFFFFFFF * 0x401 leaves 0x400 in the upper dword, which lands at URB1's actual_len offset.
Compaction copies the following data upward, pulling URB2's header into URB1. URB1 now has a valid header (pipe/list pointers), actual_len=0x400, and a data pointer that already sits at the end of URB2's buffer.
Reaping URB1 copies 0x400 bytes from the location just before URB3, an OOB read of URB3's header/self-references that reveals absolute heap addresses and defeats ASLR so that forged structures can be built afterwards.
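The 0x400 value is plain 64-bit arithmetic on the merged length and can be verified directly (variable names are illustrative):

```python
FAKE_LEN = 0xFFFFFFFF   # size of each fake entry
MERGED = 0x401          # number of entries that coalesce into one

total = FAKE_LEN * MERGED
upper = total >> 32
lower = total & 0xFFFFFFFF

print(hex(total))  # 0x400fffffbff
print(hex(upper))  # 0x400
print(hex(lower))  # 0xfffffbff
```

The product equals (0x401 << 32) - 0x401, so the upper dword is one less than the entry count. That is why 0x401 entries are needed to leave 0x400 in the dword that overlaps actual_len.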

Post-leak primitives (without re-triggering the bug)

Build a forged URB structure inside a shader in Hole0, then use the coalescing "move up" to replace URB1 with the forged data.
Make the URB persistent: set URB1.next = Hole0 and bump the refcount; reaping URB1 then puts the Hole0-backed fake URB at the FIFO head. Future primitives are just reallocations of Hole0 with new fake URBs.
Arbitrary read: craft a fake URB with a chosen data_ptr and actual_len, then reap it so that host memory is copied into the guest.
Arbitrary write (32-bit): use a fake URB whose pipe points at controlled memory and abuse the UHCI TDBuffer writeback to store a chosen dword at an arbitrary address.
Arbitrary call: overwrite a USB pipe callback; the host calls it with controlled data at RCX+0x90. Resolve WinExec dynamically (by reading Kernel32 from the guest side) and pivot through a CFG-valid gadget inside vmware-vmx that loads its arguments from RCX+0x100 and dispatches WinExec("calc.exe").

LFH timing side-channel to learn the initial bucket offset

Deterministic Ping-Pong requires knowing the LFH free-chunk offset (which of the 16 slots is hit first). Use the VMware backdoor instruction (inl %%dx, %%eax) with the synchronous VMware Tools command vmx.capability.unified_loop and a 0x4000-byte string, which forces two 0x4000 allocations per call.
Time 8 calls (16 allocations) with gettimeofday; one call shows a consistent spike when the LFH creates a new bucket. Retest with one extra allocation: if the spike stays at the same index the offset is odd, if it shifts the offset is even; otherwise restart because of noise.
Caveat: unified_loop stores unique strings in an unfreeable list, which adds O(n) lookup overhead and noise, so the side channel must converge quickly.
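The parity decision can be expressed as a small analysis step over two timing runs. This sketch covers only the classification, not the measurement; the spike criterion (exactly one sample above 3x the median) is an assumed threshold, and `spike_index` and `offset_parity` are hypothetical helpers.

```python
from statistics import median


def spike_index(samples, factor=3.0):
    """Return the index of the single sample above factor * median, else None."""
    base = median(samples)
    hits = [i for i, t in enumerate(samples) if t > factor * base]
    return hits[0] if len(hits) == 1 else None


def offset_parity(first_run, second_run):
    """Compare the baseline run with the run taken after one extra allocation."""
    a, b = spike_index(first_run), spike_index(second_run)
    if a is None or b is None:
        return "retry"                    # no clean spike: treat as noise
    return "odd" if a == b else "even"    # same index: odd, shifted: even


baseline = [10, 11, 10, 90, 10, 12, 11, 10]   # microseconds per call (made up)
shifted = [10, 11, 95, 10, 10, 12, 11, 10]
print(offset_parity(baseline, baseline))  # odd
print(offset_parity(baseline, shifted))   # even
```

Since every measurement adds a string to the unfreeable list, a practical version would cap the number of retries instead of looping until the result is clean.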

References

Synacktiv – On the clock: Escaping VMware Workstation at Pwn2Own Berlin 2025
