Sunday, September 2, 2012

Android and boot animation

I wanted to check how the fancy boot view is created for Android 4.x (Ice Cream Sandwich, or ICS):


A quick investigation showed that the general boot process for Android-based devices looks like this: linux:boot:android

Note this step in particular:
bootanimation: shows the animation during boot-up
After a short look through the source, I found the one important file: BootAnimation.cpp

All the details are in there, but in fact not much OpenGL ES is used beyond showing textures over time as animation frames; see bool BootAnimation::threadLoop() at the link above.

The OpenGL ES usage comes down to a handful of calls, with the animation sequence itself done inside bool BootAnimation::movie() using even simpler calls.

Sad story: I was under the impression that the boot animation is dynamic in the full sense and rendered by OpenGL ES

... well, it was just a hope ...


Besides, you can see where the animation is located, so it is now easy to replace it with your own, if you wish.

An example bootanimation.zip is here: http://www.mediafire.com/?663k1abf2d1wy2m
or just google for such a raw sample to find one you like...


(You can replace the default one with yours with an adb command like:
adb push ./bootanimation.zip /data/local/bootanimation.zip

As the sources show, the user's boot animation has higher priority than the system one,
unless the system is encrypted AND an encrypted-system animation is present.)
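
For anyone who wants to pack their own animation, here is a minimal Python sketch of building such a bootanimation.zip. It rests on my reading of what BootAnimation::movie() parses: a desc.txt whose first line is "width height fps", followed by "p loop pause folder" lines, with all entries stored uncompressed. The function name, the 480x800 size, the 30 fps and the part0 folder are my own placeholders, not values from the post.

```python
import io
import zipfile


def build_bootanimation(frames, width=480, height=800, fps=30):
    """Return the bytes of a bootanimation.zip with one endlessly looping part.

    frames is a list of PNG files given as bytes (hypothetical input);
    the size and fps defaults are placeholders, not values from the post.
    """
    # Assumed desc.txt layout: "WIDTH HEIGHT FPS", then "p LOOP PAUSE FOLDER".
    desc = f"{width} {height} {fps}\np 0 0 part0\n"
    buf = io.BytesIO()
    # ZIP_STORED keeps every entry uncompressed, as the boot code expects.
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("desc.txt", desc)
        for index, png_bytes in enumerate(frames):
            zf.writestr(f"part0/{index:04d}.png", png_bytes)
    return buf.getvalue()
```

The resulting bytes can be written to ./bootanimation.zip and pushed with the adb command above.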


Is my mobile emailing secure?

I wanted to check whether the emails I send and read on my iPhone can be sniffed, meaning be visible to someone else, when I use a WiFi hotspot,
like when traveling...

To check this, I had to set up a small test environment, as very well described here: Capturing iPhone traffic with Wireshark

Different techniques are possible, but I wanted to do it fast, lazy me...

So I opened the captured packets in Wireshark and searched for "yahoo" and "google", the email services I am using:


Now about the findings:
- For Yahoo, I see the certificate exchange and more:

- For Gmail, the same:

and sadly (joking), nothing unencrypted/unsecured about my email text, email contacts, etc.

Well, I feel safe,
or?


PS: nothing special was done to the default email settings.

However, for the Gmail account the SSL option is set to ON;
Yahoo mail on the iPhone currently has no such setting.

Sunday, June 3, 2012

Video decode on CPU vs HW / iOS and battery

I just had some fun looking deep into a video streaming app on iOS devices. It is a nice app, as it can play a variety of video streams and not only the ones "supported by Apple" (meaning the hardware/HW/GPU way of video decode and playback).

Under the hood it uses parts and bits of VLC together with this trick:
- decode on the CPU (ARM) via the standard VLC code
- video output via the OpenGL stack of the iOS SDK


This approach also allows AirPlay to be used, which is quite handy,
even though with some setup limitations:
you cannot establish a proper connection (called "Mirroring") from within the app, only on the system level (this is still restricted by Apple down to a Private API, and using it would make the app unacceptable for App Store distribution),

whereas in general: You can add an AirPlay picker to your media playback controls by using MPVolumeView...

And the last note: I have a strong feeling that this approach drains my battery something like 2-4x faster than the iOS default, presumably HW-based, video playback.

Not sure if I really like it.



Thursday, May 31, 2012

OpenCL 1.2 and OpenCL binaries

It looks like I was a bit sleepy (or busy?) and overlooked what may be the first released OpenCL 1.2 implementation, from AMD.
Practically all the changes were expected and are described in many places;

from my side I can just add a diff view:
note the claimed AVX support.

My kernels work.

Now about interoperability of binary kernels: true, nobody promised much, especially between vendors. For now:
- AMD: uses a binary, a kind of ELF format

- Intel: uses a binary, a kind of LLVM bitcode format with minor additions (to prevent quick disassembly, or just because of an old LLVM (2.8, 2.9 ...))
 (note: the pattern 0x42 0x43 0xC0 0xDE starting at offset 0x18 is a property of the bc format, its magic number, aka BC 0xC0DE)


- Nvidia: uses the plain-text NVVM IR/PTX format
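
The three observations above can be turned into a rough format sniffer. This is only a sketch in Python: the function name and the returned labels are mine, and the checks are heuristics (the ELF magic, a search for the bitcode magic anywhere in the blob, and the .version/.target directives that PTX text carries), not anything the vendors document as a contract.

```python
ELF_MAGIC = b"\x7fELF"
LLVM_BC_MAGIC = b"BC\xc0\xde"  # 0x42 0x43 0xC0 0xDE


def guess_kernel_binary_format(blob: bytes) -> str:
    """Heuristically classify an OpenCL program binary (hypothetical helper)."""
    if blob.startswith(ELF_MAGIC):
        return "elf"
    # The bitcode may sit behind a vendor header, so search instead of
    # testing offset 0; report where the magic was found.
    offset = blob.find(LLVM_BC_MAGIC)
    if offset != -1:
        return f"llvm-bitcode@0x{offset:x}"
    # PTX is plain text and carries .version and .target directives.
    if b".version" in blob and b".target" in blob:
        return "ptx-text"
    return "unknown"
```

Feeding it the bytes returned by clGetProgramInfo with CL_PROGRAM_BINARIES would be the obvious use; for the Intel case described above it should report the magic at offset 0x18.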

Monday, April 30, 2012

A tube amplifier for headphones

I have tried, and then re-tried, a lot of headphones, but these ones, Sony-MDR-V700DJ-DJ, have served me for a long time (I bought the first pair 8-10 years ago and they still "serve") and I got used to them long ago. I like not only that they are closed, so the people/family around me don't complain, but also simply the quality, although I have already re-soldered the original plug, but that was my own fault...
By the way, even small devices like an iPhone can drive them well enough.
So they became my "travel friends", since they really help to disconnect from the outside world during flights and ... simply sleep :)

But the point of this post is something else: an idiot's dream is coming true, and soon an amplifier like this will reach me: 6Zh51P (6Ж51П) tube amplifier
Everything is simple, but tube-warm. Of course I need to think about a proper DAC so that everything is nice and neat, but "cities are not built at once".
Anyway, to be continued.

Thursday, April 26, 2012

Compressed Video Stream Analyzers

It so happens that for checking and analyzing a video stream down to its lowest-level details, and for compliance with H.264/AVC and other standards, not many tools are available.

Let me count (in no particular order):
and I am not sure whether I missed any.

Are they any use?

To a good extent, they are unique in their functionality!
How are you going to prove that your implementation is good and stable enough without such tools?

I have had a very good experience with StreamEye: it helped to solve several very tricky bugs related to NALs and SEIs, where some players were able to decode and show the encoded streams properly but others were not; Windows Media Player refused to play, but QuickTime...



It will be interesting to see what happens with the next standard, "H.265".





Monday, April 23, 2012

The unknown Sinclair

I don't know whether to believe it or not; that is up to everyone who reads it.

Personally, for some reason, I happened to tinker more with the "Specialist" (Специалист); I even wrote programs for it, but they got lost...


Anyway, I quote:
"Sir Clive Sinclair, after releasing the ZX-80 and ZX-81, invents a remarkable computer, the ZX-82, which was later named the ZX-Spectrum. In April 1982 it is released. The computer is very popular, and in 1983 Sinclair releases the Microdrive and Interface 1. These devices make it possible to work with data faster than on cassettes. In addition, there is a serial port (RS232) and the ability to network computers. In 1984 a version with an improved keyboard and a heatsink in the power supply appears, called the ZX Spectrum +. And finally, in 1986, the Sinclair Spectrum 128 comes out, with 128 kilobytes of memory, but it can also be switched to a 48-kilobyte mode. It also has a new operating system with a built-in calculator and an extended BASIC. But in 1986 Sinclair Research LTD went bankrupt. The rights to produce Spectrums are acquired by Amstrad. From 1986 to 1988 Amstrad releases three models: the Amstrad Spectrum +2, +2A and +3 with a disk drive."

That is rather little information for a period of more than six years in the development of one of the most mass-produced and legendary computers. Besides, the story turned out to be, to put it mildly, inaccurate: at the time the ZX80 was released (that is exactly how its name is spelled) Clive Sinclair was not a "Sir", that is, he had no knighthood. Nor did he have one two years later, when the ZX Spectrum was released. Besides, Sinclair invents nothing; he merely heads a limited liability company engaged in development (in many directions) with the involvement of other companies.


and it continues in detail, year by year

Sunday, April 22, 2012

Liar is a social skill (c) Dwarf Fortress wiki

"Liar is a social skill used by dwarves to make friends. Dwarves with a developed skill find it easier to make a friend and have a higher chance of becoming mayor."
Taken from here: http://www.dfwk.ru/Liar

Well, I don't even know what to add, it's funny...

Although Dwarf Fortress is, of course, one more proof, so to speak, that programming for a game comes second or third;
what matters most is the idea.

The complexity of the game is striking:
taken from here

As a continuation of proving the point "ideas first and only then programming", one can look at the second inspirer of Minecraft (the first being Dwarf Fortress, yes, exactly so, IMHO), Infiniminer, and its sources.
There is actually not much to look at in the sources (maybe only CaveGenerator, in particular its use of gradients and extrapolation to create a cave; everything else is plain XNA usage),
but the effect of the projects' realization is impressive.
Minecraft:
"Despite the fact that the game was in beta and had no advertising campaign, as of March 30, 2012 the number of registered users exceeded 25 million, of which more than 5 million players bought the game"

Wednesday, March 28, 2012

Big bosses coming? Weren't you told what you expected?

Then don't despair, it happens often and everywhere.

By chance I even found a resemblance to the lyrics of one well-known song. Which one?

Listen:

watching is optional :)

Friday, March 16, 2012

Resolving of true method via objc_msgSend and within IDA for arm binaries

As you know, Objective-C code is full (80%+) of calls made through the runtime's internal objc_msgSend function.
This is not a problem unless you would like to do some reversing for good reasons :)
and at that point, knowing which method is actually called is kind of the key.

Not a problem any more: have a look at this helper IDC script for IDA, which makes the situation more transparent and works directly on ARM binaries. It turns something rather opaque:
__text:000036B8 02 F0 92 EC                       BLX             _objc_msgSend

into something more obvious, with the name of the method being called:
__text:000036B8 02 F0 92 EC                       BLX             _objc_msgSend ; @selector(getVertexSize)

Just note the method name where the app will land.

The IDC script sources are available at: https://github.com/x264msna/dearm_msgSend

Sunday, March 11, 2012

Selling copy of ... sample, ripple HD (?) effect

The ripple effect looks quite interesting, and there are some questions about how to do it.
Apple has released its own sample, GLCameraRipple, and that is not the end of the story: some guys decided to sell the sample, and what's more, it has been accepted by Apple into the App Store, right here: Ripple HD
(verified via the shaders :) they are 100% the same, including "Copyright (C) 2011 Apple Inc. All Rights Reserved.")

PS: if you are interested in "how to implement it?", just look at the full sources on Apple's sample page; the YUV -> RGB conversion is there and more, like the simulation of the UVs via
runSimulation and rippleTexCoords

There is nothing more to say on the technical side of the ripple effect implementation.

Monday, February 27, 2012

Philosophy, corporation or corporation, philosophy

It is probably hard to connect the two terms, but let's watch and not be deceived:

watch from the 2nd minute, everything is politically correct and friendly...

I wonder, are there any other reflections aloud like this?

Saturday, February 11, 2012

Understanding Verilog Blocking and Non-blocking Assignments

One of the best and most detailed explanations of this important and, to a good extent, complex topic that I have found so far is here: Understanding Verilog Blocking and Non-blocking Assignments

It is quite old by date but so useful in many respects; a must-read.

Tuesday, January 24, 2012

Bash'ing in Parallel

When you need to process many items (like the 4817 initial pictures I had to process for the video from my previous post), you had better think about how much time it might take. In my case, ImageMagick, a perfect tool but one not yet threaded well enough to use all CPU cores, takes about 2 seconds per picture, so the total time for 4817 pictures is kind of a nightmare to wait for.
So, as I am processing on Linux with a script, here is the trick to utilize all local and remote CPU cores even from a BASH script, and (!) with minimal changes required.
Instead of the loop that has been traditional for many years:

for i in `find . -type f -name "file*.png"`;
do 
do_view_port.sh `echo $i | sed -e 's/\.png//' -e 's/^.*_//'`
done 

where I process all the .png files in the folder one by one with my do_view_port.sh, the "magic" script with the needed actions, just get GNU Parallel (which is IMHO the best by now) and make very minor changes:


find . -type f -name "file*.png"  | sed -e 's/\.png//' -e 's/^.*_//' |
parallel -j+0 --eta do_view_port.sh {.}

which looks cleaner and runs faster (in my case: 7.65x).
Just note {.}, which stands for the current input argument with its extension removed...

So in total this one change, more or less, let me finish the needed processing in just 23 min (instead of 175 min, or close to 3 hours).

PS: I personally liked the free ETA (--eta), as always, meaning the time yet to go / Estimated Time to Achieve :)
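
For comparison, the same fan-out can be sketched in Python. This is not what I ran, just an illustration: view_port_argument mimics the two sed expressions from the pipeline, and run_all (my own hypothetical helper) launches do_view_port.sh once per file from a thread pool; threads are enough here because the real work happens in the child processes.

```python
import os
import re
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def view_port_argument(path: str) -> str:
    """Mimic sed -e 's/\\.png//' -e 's/^.*_//' for a single file name."""
    without_ext = path.replace(".png", "", 1)   # drop the first ".png"
    return re.sub(r"^.*_", "", without_ext)     # drop everything up to the last "_"


def run_all(root=".", command="do_view_port.sh", workers=None):
    """Run the command once per matching file, one worker per CPU core by default."""
    args = [view_port_argument(str(p))
            for p in sorted(Path(root).rglob("file*.png")) if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers or os.cpu_count()) as pool:
        # Each task just waits for its child process and reports the exit code.
        return list(pool.map(lambda a: subprocess.run([command, a]).returncode, args))
```

Unlike GNU Parallel, this sketch uses only local cores and prints no ETA, so for a shell script the two-line change above is still the smaller one.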

Sunday, January 22, 2012

breaking picture down

Japanese games have always been a pleasure to the eye, especially their effects, I would even say special effects...
At some point I even got curious how exactly they implement those rainbow colors of theirs, but I never got around to it...
However, I finally got to a, IMHO, very decent title: Devil May Cry, its 4th part.
It turned out that their, the Japanese, particle machine works at full throttle and even a bit beyond, wider and with a big palette of colors, including some poisonous-looking ones...
A lot, of course, is implemented with "properly thought-out" meshes and then shaders; for example, the upward flight of the sword is neatly repeated by prepared meshes, and in several passes at that...
Naturally, with alpha and a bit of other potion...

Somehow all this came together into a small video:


How exactly everything was prepared, rendered and post-processed will most likely end up in a separate post; it is both simple and varied there...


I will have to take something else apart "down to the screws"...

Update: the last post-effect in the video is, in essence and naturally, a fairly simple pixel/fragment shader:

// Buffer Definitions: 
//
// cbuffer FilterBlur
// {
//
//   float gXfBlurStart;                // Offset:    0 Size:     4
//   float gXfBlurWidth;                // Offset:    4 Size:     4 [unused]
//   float2 gXfScreenCenter;            // Offset:    8 Size:     8
//   float gXfAlpha;                    // Offset:   16 Size:     4
//
// }
//
//
// Resource Bindings:
//
// Name                   Type  Format         Dim Slot Elements
// ---------------- ---------- ------- ----------- ---- --------
// PointSampler0       sampler      NA          NA    0        1
// LinearSampler1      sampler      NA          NA    1        1
// PointSampler0TEXTURE    texture   float          2d    0        1
// LinearSampler1TEXTURE    texture   float          2d    1        1
// FilterBlur          cbuffer      NA          NA    0        1
//
//
//
// Input signature:
//
// Name             Index   Mask Register SysValue Format   Used
// ---------------- ----- ------ -------- -------- ------ ------
// SV_POSITION          0   xyzw        0      POS  float       
// TEXCOORD             0   xy          1     NONE  float   xy  
// TEXCOORD             1     zw        1     NONE  float       
//
//
// Output signature:
//
// Name             Index   Mask Register SysValue Format   Used
// ---------------- ----- ------ -------- -------- ------ ------
// SV_TARGET            0   xyzw        0   TARGET  float   xyzw
//
ps_4_0
dcl_input linear v1.xy
dcl_output o0.xyzw
dcl_constantbuffer  cb0[2], immediateIndexed
dcl_sampler s0, mode_default
dcl_sampler s1, mode_default
dcl_resource_texture2d ( float , float , float , float ) t0
dcl_resource_texture2d ( float , float , float , float ) t1
dcl_temps 2
add r0.xy, v1.xyxx, -cb0[0].zwzz
mad r0.xy, r0.xyxx, cb0[0].xxxx, cb0[0].zwzz
sample r0.xyzw, r0.xyxx, t1.xyzw, s1
sample r1.xyzw, v1.xyxx, t0.xyzw, s0
add r0.xyzw, r0.xyzw, -r1.xyzw
mad o0.xyzw, cb0[1].xxxx, r0.xyzw, r1.xyzw
ret
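
Read line by line, the listing above does two things: it scales the texture coordinate around gXfScreenCenter by gXfBlurStart, samples the second texture (t1, linear sampler) there, and then blends that with the untouched sample of the first texture (t0, point sampler) using gXfAlpha. Here is a small Python sketch of that arithmetic as I read it; the function names and the callable samplers are my own stand-ins, not part of the game's code.

```python
def blur_sample_uv(uv, center, blur_start):
    """add + mad: (uv - center) * blur_start + center, per component."""
    return tuple((c - m) * blur_start + m for c, m in zip(uv, center))


def blend(base, blurred, alpha):
    """add + mad: base + alpha * (blurred - base), a per-channel lerp."""
    return tuple(b + alpha * (s - b) for b, s in zip(base, blurred))


def filter_blur_pixel(uv, sample_base, sample_blur, center, blur_start, alpha):
    """One output pixel; sample_base/sample_blur stand in for t0/s0 and t1/s1."""
    base = sample_base(uv)                                       # sample ... t0, s0
    blurred = sample_blur(blur_sample_uv(uv, center, blur_start))  # sample ... t1, s1
    return blend(base, blurred, alpha)
```

With alpha at 0 the pixel is just the base image, and with alpha at 1 it is entirely the image resampled around the screen center.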

Thursday, January 12, 2012

Timing fun: timeBeginPeriod/timeEndPeriod

I will start with this: The multimedia timer services allow an application to schedule periodic timer events - that is, the application can request and receive timer messages at application-specified intervals.
It is quite interesting to see this limitation:
You must match each call to timeBeginPeriod with a call to timeEndPeriod, specifying the same minimum resolution in both calls. An application can make multiple timeBeginPeriod calls as long as each call is matched with a call to timeEndPeriod.

That is from the MSDN description of timeBeginPeriod, and the MSDN description of timeEndPeriod repeats the same remark word for word.

Funny, right? "Must match each call ... the same minimum resolution in both calls ..."
Let me show the details that will help you understand why:
(disassembled into C-like code by Hex-Rays; not everything has been renamed, only the major logic)
MMRESULT __stdcall timeBeginPeriod(UINT uPeriod)
{
  UINT v1; // esi@1
  char *v2; // eax@2
  __int16 v3; // cx@3
  int v4; // eax@8
  MMRESULT v5; // esi@10
  MMRESULT result; // eax@11

  v1 = uPeriod;
  if ( uPeriod < TDD_MAXRESOLUTION )
  {
    result = 97;
  }
  else
  {
    JUMPOUT(uPeriod, dword_41B28FE4, loc_41B0A0A1);
    EnterCriticalSection(&ResolutionCritSec);
    v2 = (char *)&word_41B28FF6[v1 - TDD_MAXRESOLUTION];
    if ( *(_WORD *)v2 == -1 )
    {
      LeaveCriticalSection(&ResolutionCritSec);
      result = 97;
    }
    else
    {
      v3 = *(_WORD *)v2 + 1;
      *(_WORD *)v2 = v3;
      if ( v3 != 1 || v1 >= saved_value_2 )
      {
        v5 = 0;
      }
      else
      {
        if ( WPP_GLOBAL_Control != &WPP_GLOBAL_Control
          && *((_DWORD *)WPP_GLOBAL_Control + 7) & 0x400000
          && *((_BYTE *)WPP_GLOBAL_Control + 25) >= 5u )
          WPP_SF_P(
            *((_DWORD *)WPP_GLOBAL_Control + 4),
            *((_DWORD *)WPP_GLOBAL_Control + 5),
            16,
            (int)dword_41B02720,
            v1);
        v4 = 10000 * v1;
        uPeriod = 10000 * v1;
        if ( 10000 * v1 < MinimumTime )
        {
          v4 = MinimumTime;
          uPeriod = MinimumTime;
        }
        if ( NtSetTimerResolution(&uPeriod, v4, 1, &uPeriod) < 0 )
        {
          if ( WPP_GLOBAL_Control != &WPP_GLOBAL_Control && *((_DWORD *)WPP_GLOBAL_Control + 7) & 0x400000 )
          {
            if ( *((_BYTE *)WPP_GLOBAL_Control + 25) >= 1u )
              WPP_SF_P(
                *((_DWORD *)WPP_GLOBAL_Control + 4),
                *((_DWORD *)WPP_GLOBAL_Control + 5),
                17,
                (int)dword_41B02720,
                v1);
          }
          --word_41B28FF6[v1 - TDD_MAXRESOLUTION];
          v5 = 97;
        }
        else
        {
          saved_value = v1;
          v5 = 0;
          saved_value_2 = (uPeriod + 9900) / 0x2710;
        }
      }
      LeaveCriticalSection(&ResolutionCritSec);
      result = v5;
    }
  }
  return result;
}



//----- (41B09FEB) --------------------------------------------------------
MMRESULT __stdcall timeEndPeriod(UINT uPeriod)
{
  UINT v1; // esi@1
  char *v2; // eax@3
  __int16 v3; // cx@4
  int v4; // ecx@9
  MMRESULT v5; // esi@10
  MMRESULT result; // eax@11
  char v7; // [sp+4h] [bp-4h]@9

  v1 = uPeriod;
  if ( uPeriod < TDD_MAXRESOLUTION )
  {
    result = 97;
  }
  else
  {
    if ( uPeriod >= dword_41B28FE4 )
    {
      result = 0;
    }
    else
    {
      EnterCriticalSection(&ResolutionCritSec);
      v2 = (char *)&unk_41B28FF6 + 2 * (v1 - TDD_MAXRESOLUTION);
      if ( *(_WORD *)v2 )
      {
        v3 = *(_WORD *)v2 - 1;
        *(_WORD *)v2 = v3;
        if ( !v3 && v1 == saved_value )
        {
          while ( v1 < dword_41B28FE4 && !*(_WORD *)v2 )
          {
            ++v1;
            v2 += 2;
          }
          NtSetTimerResolution(dword_41B28FE4, 10000 * saved_value_2, 0, &v7);
          saved_value_2 = dword_41B28FE4;
          saved_value = v1;
          if ( v1 < dword_41B28FE4 )
          {
            if ( WPP_GLOBAL_Control != &WPP_GLOBAL_Control
              && *((_DWORD *)WPP_GLOBAL_Control + 7) & 0x400000
              && *((_BYTE *)WPP_GLOBAL_Control + 25) >= 5u )
              WPP_SF_P(
                *((_DWORD *)WPP_GLOBAL_Control + 4),
                *((_DWORD *)WPP_GLOBAL_Control + 5),
                20,
                (int)dword_41B02720,
                v1);
            if ( NtSetTimerResolution(v4, 10000 * v1, 1, &uPeriod) < 0 )
            {
              if ( WPP_GLOBAL_Control != &WPP_GLOBAL_Control && *((_DWORD *)WPP_GLOBAL_Control + 7) & 0x400000 )
              {
                if ( *((_BYTE *)WPP_GLOBAL_Control + 25) >= 1u )
                  WPP_SF_P(
                    *((_DWORD *)WPP_GLOBAL_Control + 4),
                    *((_DWORD *)WPP_GLOBAL_Control + 5),
                    21,
                    (int)dword_41B02720,
                    v1);
              }
            }
            else
            {
              saved_value_2 = (uPeriod + 9999) / 0x2710;
            }
          }
        }
        v5 = 0;
      }
      else
      {
        v5 = 97;
      }
      LeaveCriticalSection(&ResolutionCritSec);
      result = v5;
    }
  }
  return result;
}

Note several things:
- the usage of saved_value (and saved_value_2)
- the usage of TDD_MAXRESOLUTION and the details of the error returns (97 is TIMERR_NOCANDO)
- the internal usage of EnterCriticalSection to be properly thread safe
(I will skip the rest as less relevant for now)

You have already noticed the usage of
if ( !v3 && v1 == saved_value )
inside timeEndPeriod, right :)?

That describes and answers the "Must match each call ... the same minimum resolution in both calls ..." remark.
IMHO, a purely architectural issue...
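
To make the point easier to see, here is a heavily simplified Python model of the bookkeeping in the two listings: one reference counter per requested period, and a reset of the active resolution only when the counter of the currently active period drops to zero. It is a sketch of my reading of the decompiled code, not the real winmm logic; the class name, the 1..16 ms range and the attribute names are assumptions.

```python
TIMERR_NOERROR = 0
TIMERR_NOCANDO = 97


class TimerResolutionModel:
    """Toy model: per-period reference counts plus the period currently in force."""

    def __init__(self, min_period=1, max_period=16):
        self.min_period = min_period   # stands in for TDD_MAXRESOLUTION
        self.max_period = max_period   # stands in for dword_41B28FE4
        self.refcounts = {}            # period -> outstanding begin calls
        self.active = max_period       # stands in for saved_value

    def begin(self, period):
        if period < self.min_period:
            return TIMERR_NOCANDO
        if period >= self.max_period:
            return TIMERR_NOERROR      # nothing to tighten
        self.refcounts[period] = self.refcounts.get(period, 0) + 1
        if period < self.active:
            self.active = period       # a finer resolution wins
        return TIMERR_NOERROR

    def end(self, period):
        if period < self.min_period:
            return TIMERR_NOCANDO
        if period >= self.max_period:
            return TIMERR_NOERROR
        if self.refcounts.get(period, 0) == 0:
            return TIMERR_NOCANDO      # no begin call with this exact period
        self.refcounts[period] -= 1
        if self.refcounts[period] == 0 and period == self.active:
            # Fall back to the next finest period still requested, if any.
            remaining = [p for p, n in self.refcounts.items() if n > 0]
            self.active = min(remaining, default=self.max_period)
        return TIMERR_NOERROR
```

In this model, calling begin(1) and then end(2) returns 97 and leaves the 1 ms resolution in force, which is exactly the mismatch the documentation warns about.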

Now about the timeGetDevCaps function, used to determine the minimum and maximum timer resolutions supported by the timer services, and TDD_MAXRESOLUTION:
well, its code shows everything faster than I could ever describe it:
MMRESULT __stdcall timeGetDevCaps(LPTIMECAPS ptc, UINT cbtc)
{
  MMRESULT result; // eax@3

  if ( cbtc >= 8 && ptc )
  {
    ptc->wPeriodMin = TDD_MAXRESOLUTION;
    ptc->wPeriodMax = 1000000;
    result = 0;
  }
  else
  {
    result = 97;
  }
  return result;
}

PS: WineHQ does things differently...

PS2: With Hex-Rays: some time (a coffee break) for the Hex-Rays processing, then 10 sec of looking to understand the full story.
Assembler, w/o Hex-Rays: 1 min (mostly scrolling back and forth), so 6x slower? :)

Saturday, January 7, 2012

Deferred Shading by OpenGL

Deferred shading, rather than standard shading, is quite popular these days. Many of the details, with pros and cons, are already well described elsewhere.
With many lights and vertices it is worth checking: I have already seen about 5.7x better results for the deferred approach: Def: 480 FPS vs Std: 84 FPS for 1081344 vertices.
This is based on the model(s): Harley Quinn alone contains a bit more than 36K vertices.
On the OpenGL side the differences look like this (ExamDiff Pro diff report; the first text fragment is deferred shading, the second is standard shading):
//Deferred shading (first text fragment)
//Pass 1
//Draw the geometry, saving parameters into the buffer

//Make the pbuffer the current context
pbuffer.MakeCurrent();

//Clear buffers
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glColor4f(1.0f, 1.0f, 1.0f, 1.0f);

glLoadIdentity();
gluLookAt( 0.0f, 4.0f, 3.0f,
           0.0f, 0.0f, 0.0f,
           0.0f, 1.0f, 0.0f);

//Bind and enable vertex & fragment programs
glBindProgramARB(GL_VERTEX_PROGRAM_ARB, deferredShadingPass1VP);
glEnable(GL_VERTEX_PROGRAM_ARB);

glBindProgramNV(GL_FRAGMENT_PROGRAM_NV, deferredShadingPass1FP);
glEnable(GL_FRAGMENT_PROGRAM_NV);

//Draw the torus knot
glDrawElements(GL_TRIANGLES, torusKnot.numIndices, GL_UNSIGNED_INT, (char *)NULL);

//Draw the "floor"
glNormal3f(0.0f, 1.0f, 0.0f);
glBegin(GL_TRIANGLE_STRIP);
{
    glVertex3f( 5.0f,-0.5f, 5.0f);
    glVertex3f( 5.0f,-0.5f,-5.0f);
    glVertex3f(-5.0f,-0.5f, 5.0f);
    glVertex3f(-5.0f,-0.5f,-5.0f);
}
glEnd();

glDisable(GL_VERTEX_PROGRAM_ARB);
glDisable(GL_FRAGMENT_PROGRAM_NV);

//Copy the pbuffer contents into the pbuffer texture
glBindTexture(GL_TEXTURE_RECTANGLE_NV, pbufferTexture);
glCopyTexSubImage2D(GL_TEXTURE_RECTANGLE_NV, 0, 0, 0, 0, 0,
                    pbuffer.width, pbuffer.height);

//Make the window the current context
WINDOW::Instance()->MakeCurrent();

//Pass 2
//Draw a quad covering the region of influence of each light
//Unpack the data from the buffer, perform the lighting equation and update
//the framebuffer

//Set orthographic projection, 1 unit=1 pixel
glMatrixMode(GL_PROJECTION);
glPushMatrix();
glLoadIdentity();
gluOrtho2D(0, WINDOW::Instance()->width, 0, WINDOW::Instance()->height);

//Set identity modelview
glMatrixMode(GL_MODELVIEW);
glPushMatrix();
glLoadIdentity();

//Disable depth test
glDisable(GL_DEPTH_TEST);

//Bind the pbuffer texture
glBindTexture(GL_TEXTURE_RECTANGLE_NV, pbufferTexture);

//Bind and enable fragment program
glBindProgramNV(GL_FRAGMENT_PROGRAM_NV, deferredShadingPass2FP);
glEnable(GL_FRAGMENT_PROGRAM_NV);

//Loop through the lights
for(int i=0; i<numLights; ++i)
{
    //Calculate the rectangle to draw for this light
    int rectX, rectY, rectWidth, rectHeight;

    lights[i].GetWindowRect(WINDOW::Instance()->width, WINDOW::Instance()->height,
                            viewMatrix, currentTime, cameraNearDistance,
                            cameraFovy, cameraAspectRatio,
                            rectX, rectY, rectWidth, rectHeight);

    //Enable additive blend if i>0
    if(i>0)
    {
        glBlendFunc(GL_ONE, GL_ONE);
        glEnable(GL_BLEND);
    }

    //Send the light's color to fragment program local parameter 0
    glProgramLocalParameter4fvARB( GL_FRAGMENT_PROGRAM_NV, 0, lights[i].color);

    //Send 1/(light radius)^2 to fragment program local parameter 1
    float inverseSquareLightRadius=1.0f/(lights[i].radius*lights[i].radius);
    glProgramLocalParameter4fARB( GL_FRAGMENT_PROGRAM_NV, 1,
                                  inverseSquareLightRadius, inverseSquareLightRadius,
                                  inverseSquareLightRadius, inverseSquareLightRadius);

    //Send the light's position to fragment program local parameter 2
    glProgramLocalParameter4fvARB( GL_FRAGMENT_PROGRAM_NV, 2,
                                   VECTOR4D(lights[i].GetPosition(currentTime)));

    //Draw the rectangle
    glBegin(GL_TRIANGLE_STRIP);
    {
        glVertex2i(rectX, rectY);
        glVertex2i(rectX + rectWidth, rectY);
        glVertex2i(rectX, rectY + rectHeight);
        glVertex2i(rectX + rectWidth, rectY + rectHeight);
    }
    glEnd();
}

//Restore matrices
glMatrixMode(GL_PROJECTION);
glPopMatrix();
glMatrixMode(GL_MODELVIEW);
glPopMatrix();

glEnable(GL_DEPTH_TEST);
glDisable(GL_FRAGMENT_PROGRAM_NV);
glDisable(GL_BLEND);

//Standard shading (second text fragment)
//Make an initial pass to lay down Z
glColorMask(0, 0, 0, 0);

//Draw the torus knot
glDrawElements(GL_TRIANGLES, torusKnot.numIndices, GL_UNSIGNED_INT, (char *)NULL);

//Draw the "floor"
glNormal3f(0.0f, 1.0f, 0.0f);
glBegin(GL_TRIANGLE_STRIP);
{
    glVertex3f( 5.0f,-0.5f, 5.0f);
    glVertex3f( 5.0f,-0.5f,-5.0f);
    glVertex3f(-5.0f,-0.5f, 5.0f);
    glVertex3f(-5.0f,-0.5f,-5.0f);
}
glEnd();

glColorMask(1, 1, 1, 1);

//Bind and enable vertex & fragment programs
glBindProgramARB(GL_VERTEX_PROGRAM_ARB, standardShadingVP);
glEnable(GL_VERTEX_PROGRAM_ARB);

glBindProgramARB(GL_FRAGMENT_PROGRAM_ARB, standardShadingFP);
glEnable(GL_FRAGMENT_PROGRAM_ARB);

//Loop through the lights
for(int i=0; i<numLights; ++i)
{
    //Calculate and set the scissor rectangle for this light
    int scissorX, scissorY, scissorWidth, scissorHeight;

    lights[i].GetWindowRect(WINDOW::Instance()->width, WINDOW::Instance()->height,
                            viewMatrix, currentTime, cameraNearDistance,
                            cameraFovy, cameraAspectRatio,
                            scissorX, scissorY, scissorWidth, scissorHeight);

    glScissor(scissorX, scissorY, scissorWidth, scissorHeight);
    glEnable(GL_SCISSOR_TEST);

    //Enable additive blend if i>0
    if(i>0)
    {
        glBlendFunc(GL_ONE, GL_ONE);
        glEnable(GL_BLEND);
    }

    //Calculate the object space light position and send to
    //vertex program local parameter 0
    //Object space and world space are the same
    glProgramLocalParameter4fvARB( GL_VERTEX_PROGRAM_ARB, 0,
                                   VECTOR4D(lights[i].GetPosition(currentTime)));

    //Send the light's color to fragment program local parameter 0
    glProgramLocalParameter4fvARB( GL_FRAGMENT_PROGRAM_ARB, 0, lights[i].color);

    //Send 1/(light radius)^2 to fragment program local parameter 1
    float inverseSquareLightRadius=1.0f/(lights[i].radius*lights[i].radius);
    glProgramLocalParameter4fARB( GL_FRAGMENT_PROGRAM_ARB, 1,
                                  inverseSquareLightRadius, inverseSquareLightRadius,
                                  inverseSquareLightRadius, inverseSquareLightRadius);

    //Draw the torus knot
    glDrawElements(GL_TRIANGLES, torusKnot.numIndices, GL_UNSIGNED_INT, (char *)NULL);

    //Draw the "floor"
    glNormal3f(0.0f, 1.0f, 0.0f);
    glBegin(GL_TRIANGLE_STRIP);
    {
        glVertex3f( 5.0f,-0.5f, 5.0f);
        glVertex3f( 5.0f,-0.5f,-5.0f);
        glVertex3f(-5.0f,-0.5f, 5.0f);
        glVertex3f(-5.0f,-0.5f,-5.0f);
    }
    glEnd();
}

glDisable(GL_VERTEX_PROGRAM_ARB);
glDisable(GL_FRAGMENT_PROGRAM_ARB);
glDisable(GL_SCISSOR_TEST);
glDisable(GL_BLEND);

Diff report summary: Number of differences: 10; Added(5,22); Deleted(0,48); Changed(82); Changed in changed(42); Ignored

The shader side will come later...

PS: have you seen a good difference viewer?

Wednesday, January 4, 2012

Parallel file load

In some cases and on some OSes (I have to say: Linux),
loading the files to process in parallel, such as:
tbb::tick_count start = tbb::tick_count::now();

parallel_invoke( [&]() {preload(argv[1],first);},[&]() {preload(argv[2],second);} );

// preload #1 0.118449 seconds ,  :: parallel_invoke
// preload #2 0.130777 seconds ,  :: preload, preload
can be, as you can see, ~10% faster (measured via tbb::tick_count, for sure).

I don't really want to go into more details, but it is a nice improvement for almost no changes....
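
The same idea, sketched in Python for anyone without TBB at hand. This is an illustration only: preload here simply reads a whole file into memory, which is my assumption about what the C++ preload does, and the helper names are mine. Any gain depends on the OS and the storage, so measure before relying on it.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def preload(path):
    """Read one file fully into memory (assumed meaning of the C++ preload)."""
    with open(path, "rb") as handle:
        return handle.read()


def preload_parallel(paths):
    """Load all files concurrently, one thread per file, keeping the input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        return list(pool.map(preload, paths))


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def demo():
    """Compare sequential and parallel loading on two small temporary files."""
    with tempfile.TemporaryDirectory() as folder:
        paths = []
        for name in ("first.bin", "second.bin"):
            path = os.path.join(folder, name)
            with open(path, "wb") as handle:
                handle.write(os.urandom(1 << 20))
            paths.append(path)
        _, sequential = timed(lambda: [preload(p) for p in paths])
        _, parallel = timed(preload_parallel, paths)
        print(f"sequential {sequential:.6f} s, parallel {parallel:.6f} s")
```

Threads are fine for this in Python because the file reads spend their time in the OS rather than in the interpreter.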