GFF format details?


55 replies to this topic

#26
Tierrie
  • Members
  • 13 posts
So I figured out that there are a few blocks of data in the raw data section (the wiki calls them "structs", but they are not GFF Structs), and that each field provides an offset from the beginning of its "block".

I cannot find the length of the previous block, nor the offset of the beginning of this "block" from the beginning of the raw data section. Anyone got a clue here?

Edited by Tierrie, 24 December 2009 - 02:22.


#27
tmp7704
  • Members
  • 11,156 posts

Tierrie wrote...

I cannot find the length of the previous block, nor the offset of the beginning of this "block" from the beginning of the raw data section. Anyone got a clue here?

For the top-most "struct", i.e. the one placed first in the Struct Array part of the file, the offset of its data in the Raw Data block is 0. So to read the data for a field in this 'primary' struct, you use an offset of (0 + the "index" value from the field description) into the Raw Data block.

Any other structure is linked through the content of one of the fields (such a field has the flag 0x40000000 set as part of its Field Type). The point in the Raw Data block where the data for such a "child" struct begins is specified either by the field itself or in the list the field points to (depending on whether the field is marked as pointing to a list or as containing a single item). Which struct "template" from the Struct Array it uses is indicated by the Field Type part of the field description.
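
As a rough illustration of the flag test described above, here is a minimal Python sketch that splits a 32-bit Field Type value into its parts. Only the 0x40000000 struct flag comes from this post; the list and reference bit values, the use of the low 16 bits as the type id, and the name decode_field_type are assumptions made for the example and should be checked against the wiki.

```python
# Flag bits in the upper half of the 32-bit Field Type word.
FLAG_LIST = 0x80000000       # assumed value, not stated in this thread
FLAG_STRUCT = 0x40000000     # value given in the post above
FLAG_REFERENCE = 0x20000000  # assumed value, not stated in this thread


def decode_field_type(type_word):
    """Split a Field Type word into a type id and three flag booleans."""
    return {
        # Assumption: the low 16 bits hold the type id, which is an index
        # into the Struct Array when the struct flag is set.
        "type_id": type_word & 0xFFFF,
        "is_list": bool(type_word & FLAG_LIST),
        "is_struct": bool(type_word & FLAG_STRUCT),
        "is_reference": bool(type_word & FLAG_REFERENCE),
    }


if __name__ == "__main__":
    # A field flagged as a list of structs that uses struct template 3.
    print(decode_field_type(0xC0000003))
```

Under these assumptions, a field whose type decodes with is_struct set links to a child struct, and type_id says which Struct Array entry describes it.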

Edited by tmp7704, 24 December 2009 - 07:25.


#28
Tierrie
  • Members
  • 13 posts
My situation is specifically ECStrings. The FieldIndex is 0, 4, 8, 16, 44, and so forth. And since they are ECStrings, the values at these locations are pointers to a list of wide chars.

However, in the Fields, I cannot find the offset of the "child structures" from the beginning of the raw data block. I know that it begins at offset 8 (in this specific file) for the first "child structure" but due to the variable length of that "child structure", I don't know where the next one begins.

From reading your post, I understand that the field should contain where the data for such a "child struct" begins. But I cannot find where it is. The FieldIndex specifies the offset from the beginning of the "child struct". The Type specifies the Flags and BaseType/Id, and the Label is some number corresponding to the label (I guess).

Edited by Tierrie, 24 December 2009 - 10:03.


#29
Werefox009
  • Members
  • 18 posts
Something I noticed this morning is that the offset used for lists is actually from the start of the raw data block, not the structure. This appears to be different from standard values where the offset is from the start of the child structure and not the raw data block.

#30
Tierrie
  • Members
  • 13 posts
Offset used in lists or offsets for lists?

#31
Werefox009
  • Members
  • 18 posts
Offsets for lists. Here is a chunk of Python that works for ECStrings:

import os
import struct

def parseString( filehandle, header, field, position ):
    # Make sure we are at the correct position (struct start + field index)
    filehandle.seek( position + field["INDEX"], os.SEEK_SET)

    # Determine the position of the list. unpack() returns a tuple, so
    # take element [0]. "<I" assumes the file is little-endian.
    reference = struct.unpack("<I", filehandle.read(4))[0]

    # Confirm the list isn't empty
    if reference == 0xFFFFFFFF:
        return None

    # Advance to the position of the list (relative to the Raw Data block)
    filehandle.seek( reference + header["RAW_DATA_OFFSET"], os.SEEK_SET)

    # Extract the size of the list (number of 2-byte characters)
    size = struct.unpack("<I", filehandle.read(4))[0]

    # Get the string as a list of (byte, byte) pairs, one per wide char
    result = []
    for i in range(0, size):
        result.append( struct.unpack("2c", filehandle.read(2)) )

    return result

If you are unfamiliar with Python, struct.unpack(...) is just a method for working with binary data, not anything GFF-specific. Here position is the offset of the start of the current structure, field is the field information for this particular list, and header is all the header information.

I hope the tags will work.

Edited by Werefox009, 24 December 2009 - 10:20.


#32
tmp7704
  • Members
  • 11,156 posts

Tierrie wrote...

My situation is specifically ECStrings. The FieldIndex is 0, 4, 8, 16, 44, and so forth. And since they are ECStrings, the values at these locations are pointers to a list of wide chars.

However, in the Fields, I cannot find the offset of the "child structures" from the beginning of the raw data block.

If it's specifically for ECStrings then, from what I experienced, you can just treat the value specified by the field as a reference, i.e. a straight offset from the beginning of the Raw Data block to the point where the string data begins (the length of the string followed by a series of wide chars).

E.g. if the field index is 4, then you read a value from (start of your struct) + 4. The (start of struct) will be either 0 for the primary struct, or an offset defined by the field which determined what struct you're dealing with, as explained earlier. The value you get from your read is an offset into the Raw Data block (counting from the start of it) where you'll find the data for your string, that is, its length followed by its characters.
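
The two-step lookup described above can be sketched in Python against an in-memory copy of the Raw Data block. This is only an illustration under stated assumptions: little-endian integers, UTF-16 wide chars, and the 0xFFFFFFFF "empty" marker taken from Werefox009's snippet earlier in the thread. The name read_ecstring and the sample bytes are made up for the example.

```python
import struct

EMPTY_REFERENCE = 0xFFFFFFFF  # "empty" marker used in Werefox009's snippet


def read_ecstring(raw, struct_start, field_index):
    """Read one ECString field from `raw`, the bytes of the Raw Data block."""
    # Step 1: the field's slot is measured from the start of its own struct.
    (reference,) = struct.unpack_from("<I", raw, struct_start + field_index)
    if reference == EMPTY_REFERENCE:
        return None
    # Step 2: the value found there is measured from the start of Raw Data.
    (length,) = struct.unpack_from("<I", raw, reference)
    start = reference + 4
    # Assumption: characters are 2 bytes each, little-endian UTF-16.
    return raw[start:start + 2 * length].decode("utf-16-le")


if __name__ == "__main__":
    # Made-up block: an 8-byte primary struct whose field at index 4 points
    # to offset 8, where a 2-character string is stored.
    raw = struct.pack("<II", 0, 8) + struct.pack("<I", 2) + "Hi".encode("utf-16-le")
    print(read_ecstring(raw, 0, 4))
```

For a child struct, the same function would be called with struct_start set to that struct's offset inside the Raw Data block instead of 0.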

Edited by tmp7704, 24 December 2009 - 10:43.


#33
Tierrie
  • Members
  • 13 posts
@Werefox, alright, I see what you mean, and that's what I thought too at first.

But in the .plo files, the reference is from the beginning of the "child struct" in the raw data. So:

filehandle.seek( reference + child_struct_offset + header["RAW_DATA_OFFSET"], os.SEEK_SET)

Does your code work for 2nd and 3rd level structures?

#34
Werefox009
  • Members
  • 18 posts
It does for the test files I am currently working with, although I'm only checking .tcw, .gff and .tmsh. I'll see if I can find a .plo to test against as well.

#35
Tierrie
  • Members
  • 13 posts

tmp7704 wrote...
If it's specifically for ECStrings then, from what I experienced, you can just treat the value specified by the field as a reference, i.e. a straight offset from the beginning of the Raw Data block to the point where the string data begins (the length of the string followed by a series of wide chars).


The FieldIndex is an offset from the beginning of the "child struct". At that location is an integer value. And that value is a straight offset from the beginning of the Raw Data block.

The issue I have is about the offset from the beginning of the "child struct". If I understand the posts here - the common consensus is that FieldIndex is an offset from the beginning of the Raw Data block.

It is not.

The FieldIndex is an offset from the beginning of the "child struct". And the "child struct"s are some offset from the beginning of the Raw Data block.

It is this offset that I cannot find.

#36
Tierrie
  • Members
  • 13 posts
To better illustrate my point - I'm including an excerpt from a .plo file.

http://www.code-poet...g/gffoffset.png

#37
Werefox009
  • Members
  • 18 posts
I have now tested a .plo file against my code, and yes, it does work. (After I corrected a slight issue with lists of a specific type of structure and ignored TLKStrings. Eventually I will get that working, but for the project I am working on I don't need them atm.)

I think the issue is that there are a couple of different offsets, each measured from a different starting point. You use the field index as an offset from the start of the child structure to find the field's position within the structure's raw data. If you have an ECString, you then read the next 4 bytes as an unsigned integer to get the reference offset, measured from the start of the raw data block, to the actual ECString size (and then payload).

How do you find the child struct's offset in the data block? You have no choice but to walk the structure fields until you find it. Remember, the structure at position 0 is always there and represents the file itself, so every other structure, field and list can be found by walking the fields within structure 0 until you find it.

If you aren't already trying to solve this problem by recursive descent (treating structures as nodes and fields as leaves), then you are in for a rough time imo.
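
To show the general shape of such a recursive descent, here is a small Python sketch that starts at structure 0 and collects the Raw Data offset of every struct reachable through struct-list fields. It is not the code discussed in this thread. The flag values other than 0x40000000, the little-endian layout, the function name walk, and the simplified structs table (a size plus (type, index) pairs per struct) are all assumptions for illustration, and single-struct fields are deliberately left out.

```python
import struct

# 0x40000000 is quoted in this thread; the other two values are assumptions.
FLAG_LIST, FLAG_STRUCT, FLAG_REFERENCE = 0x80000000, 0x40000000, 0x20000000


def walk(raw, structs, struct_index, offset, depth=0, out=None):
    """Collect (depth, struct_index, offset) for structs reached via lists."""
    out = [] if out is None else out
    out.append((depth, struct_index, offset))
    for field_type, field_index in structs[struct_index]["fields"]:
        if not (field_type & FLAG_STRUCT and field_type & FLAG_LIST):
            continue  # leaf value, or a single-struct field (not handled here)
        child = field_type & 0xFFFF  # assumed: low 16 bits pick the template
        (list_ref,) = struct.unpack_from("<I", raw, offset + field_index)
        if list_ref == 0xFFFFFFFF:
            continue  # empty list
        (count,) = struct.unpack_from("<I", raw, list_ref)
        for i in range(count):
            if field_type & FLAG_REFERENCE:
                # The list holds one reference per child struct.
                (child_offset,) = struct.unpack_from("<I", raw, list_ref + 4 + 4 * i)
            else:
                # The list holds the child structs themselves, back to back.
                child_offset = list_ref + 4 + i * structs[child]["size"]
            walk(raw, structs, child, child_offset, depth + 1, out)
    return out


if __name__ == "__main__":
    # Made-up file: struct 0 has one list-of-struct-1 field at index 0.
    structs = [{"size": 4, "fields": [(0xC0000001, 0)]},
               {"size": 4, "fields": []}]
    raw = struct.pack("<II", 4, 2) + bytes(8)
    print(walk(raw, structs, 0, 0))
```

A real parser would also read the leaf values at each node; the sketch only records where each struct starts.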

#38
tmp7704
  • Members
  • 11,156 posts

Tierrie wrote...

The FieldIndex is an offset from the beginning of the "child struct". And the "child struct"s are some offset from the beginning of the Raw Data block.

Yes, this is correct for the fields which are part of the "child struct". Their index values are supposed to be added to the starting point of this "child struct" rather than the very beginning of the Raw Data block.

In order to determine this starting point you have to check the content of the field in the "parent" of your "child struct". E.g. let's say your top-level structure is "plts" for the plots. A field in this structure states it is a link to a list of structs of type "plot". The list has the following data: length of the list, and then the offset for the 1st "child struct" and then optionally for the 2nd one etc.

... that said, trying to parse the .plo files in this manner fails rather badly for me. I checked only 2 samples, but the offset defined in the (just 1 element long) list of "child structs" was said to be 72, and then trying to read a string supposedly located at this offset 72 + field index of 4 resulted in values which were way too large to be an offset to the data of the string itself. Hopefully someone more savvy in the format details can figure out what's wrong here.

edit: well, looks like it's back to the drawing board. Apparently lists of structs which also have the reference flag set store their offsets for child structs as references from the beginning of the Raw Data block... while regular lists of structs are doing something funny (I have a list with values that go something like 0, 0, 0, 16, 0, 0 as an example). Clearly the format would be too simple without extra convolutions like that.

edit2: wait, i'm an idiot. if the reference flag isn't set then a list of structs holds simply the content of these structs one after another, after the value which defines length of the list...

Edited by tmp7704, 25 December 2009 - 04:26.


#39
Tierrie
  • Members
  • 13 posts
@tmp7704, @Werefox009, do either of you mind sharing your code? Particularly the part that traverses the structures?

@Werefox009, I do use recursion, but I think either I don't understand the answers you and tmp7704 are giving, or I'm not asking the question the right way.

The most enlightening answer so far was:

A field in this structure states it is a link to a list of structs of type "plot". The list has the following data: length of the list, and then the offset for the 1st "child struct" and then optionally for the 2nd one etc.

Here's what happened when I checked the list - the first struct has flags "list" and "struct" and "index=3". So I go to the 3rd struct and recur. Where's the list that contains the length of the list and the offset of the 1st "child struct" etc?

#40
tmp7704
  • Members
  • 11,156 posts

Tierrie wrote...

Here's what happened when I checked the list - the first struct has flags "list" and "struct" and "index=3". So I go to the 3rd struct and recur. Where's the list that contains the length of the list and the offset of the 1st "child struct" etc?

What seems to work for me now (also with the *.plo files) is the following:

A field has the 'struct' and 'list' flags. The content of this field is a reference to the beginning of the actual list (the reference is an offset from the beginning of the Raw Data block).

The list begins with the number of elements it holds, stored as an unsigned int. After this number, because the 'parent' field didn't have the 'reference' flag set, follows the data for the fields of your "child" structure of 'index 3'. If there's more than one element in the list, then the data for the 2nd 'child struct' begins right after the data for the 1st one ends, and so on (or more precisely, this "step" is defined by the struct size, as stored in the Struct Array).

If the reference flag was set on the parent field (the one which held the address of this list), then after the number of elements there would instead be a series of numbers, each number being a reference from the beginning of the Raw Data block to the beginning of one of your 'child structs'.
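
The two list layouts described above can be sketched as a single Python helper that returns where each child struct starts inside the Raw Data block. This is a minimal illustration, assuming little-endian 32-bit values; the name child_struct_offsets and the sample bytes are invented for the example.

```python
import struct


def child_struct_offsets(raw, list_reference, struct_size, is_reference):
    """Return the Raw Data offsets of the child structs held by one list.

    raw            -- bytes of the Raw Data block
    list_reference -- value read from the parent field (offset of the list)
    struct_size    -- size of the child struct, as stored in the Struct Array
    is_reference   -- True if the parent field has the 'reference' flag set
    """
    (count,) = struct.unpack_from("<I", raw, list_reference)
    first = list_reference + 4  # data starts right after the element count
    if is_reference:
        # The list stores one reference per element, each measured from
        # the start of the Raw Data block.
        return list(struct.unpack_from("<%dI" % count, raw, first))
    # Otherwise the structs are stored back to back, struct_size apart.
    return [first + i * struct_size for i in range(count)]


if __name__ == "__main__":
    inline = struct.pack("<I", 3) + bytes(3 * 8)
    print(child_struct_offsets(inline, 0, 8, False))
    by_ref = struct.pack("<III", 2, 40, 72)
    print(child_struct_offsets(by_ref, 0, 8, True))
```

Each returned offset is the "start of struct" to which that child's field index values are then added.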

edit: blah, the forums eat formatting from the code

Edited by tmp7704, 25 December 2009 - 05:54.


#41
Tierrie
  • Members
  • 13 posts
@tmp7704, Thank you. I got it.

#42
Werefox009
  • Members
  • 18 posts

Tierrie wrote...

@tmp7704, @Werefox009, do either of you mind sharing your code? Particularly the part that traverses the structures?


I've shared up my initial test code here. I'm not currently working on this file, as I am tidying it up and turning it into proper classes at this point. However it does work for all the test files I have run against it so far - doesn't support TLKStrings though at this point.

I figured you probably were using recursion; although I would really like to see a linear solution - would be mind boggling code :P

[edit] I just noticed an error in the file I uploaded: essentially, the field type array that holds how to parse the field values is wrong for types above 13, so Color4f and Matrix4x4f fields won't parse properly (ECStrings will, as they don't use the array). I probably should have uploaded the one I am working on, but its output is a little confusing atm.

Edited by Werefox009, 25 December 2009 - 07:33.


#43
tmp7704
  • Members
  • 11,156 posts
Regarding TLKStrings: these appear to be just a separate version of type UINT64 -- a single unsigned long long with the value stored directly in the field. I'd guess the value of it is the ID of string in the (combined) talk table.

Edited by tmp7704, 25 December 2009 - 07:57.


#44
Tierrie
  • Members
  • 13 posts
TLKStrings are a struct consisting of 2 int32. The first int32 is the ID of the string in the talk table. The second is the length of the string in wchars.

#45
tmp7704
  • Members
  • 11,156 posts
That seems weird for two reasons... First, all TlkStrings in the *.plo files I checked had the "second int" set to 0, which would imply these strings are empty if that's indeed the purpose of this bit of data. And second, wouldn't the length of the string vary and be dependent on which language talk table the game is using?

Edited by tmp7704, 26 December 2009 - 02:38.


#46
Werefox009
  • Members
  • 18 posts
I have done a little investigation into TLKStrings and this is what I found, starting from the offset located by adding the field index to the child structure offset:

4-byte label, 4-byte reference => 4-byte unsigned size, followed by a list of 2-byte characters that is size entries long.

The 4 Byte Reference is from the start of the Raw Data Block. I worked this out by renaming the .plo file to a .gff file and then opening it using the generic GFF file viewer supplied as part of the DAO toolset. This gave me several payloads to help work out the position in the hex editor.

So it is very similar to an ECString, except that there is an extra 4-byte label, used for whatever reason, prior to the reference offset. This reference can be 0xFFFFFFFF for an empty reference.
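
As a sketch of this tentative reading only (other posts in this thread read the same bytes differently), the layout above could be parsed like this in Python. Little-endian integers and UTF-16 characters are assumptions, and read_tlkstring and the sample bytes are invented for the example.

```python
import struct


def read_tlkstring(raw, struct_start, field_index):
    """Parse a TLKString slot as (label, text) under the layout above."""
    # Slot contents: 4-byte label followed by a 4-byte reference.
    label, reference = struct.unpack_from("<II", raw, struct_start + field_index)
    if reference == 0xFFFFFFFF:
        return label, None  # empty reference
    # The reference counts from the start of the Raw Data block and points
    # at a 4-byte size followed by that many 2-byte characters.
    (size,) = struct.unpack_from("<I", raw, reference)
    start = reference + 4
    return label, raw[start:start + 2 * size].decode("utf-16-le")


if __name__ == "__main__":
    # Made-up block: label 21, reference 8, then the string "ok".
    raw = struct.pack("<II", 21, 8) + struct.pack("<I", 2) + "ok".encode("utf-16-le")
    print(read_tlkstring(raw, 0, 0))
```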

I haven't added this into my tool yet, so there is a chance it's a bit more complex than that. I was just wanting to see how difficult it would be to work out.

Edited by Werefox009, 26 December 2009 - 03:27.


#47
Werefox009
  • Members
  • 18 posts
The editor did mention something about shared references when I saved it out (specifically about the shared references no longer being shared), so there could be extra information after the reference offset (for sharing references between files).

Edited by Werefox009, 26 December 2009 - 03:32.


#48
tmp7704
  • Members
  • 11,156 posts
Hmm, I must be checking some different kind of TlkStrings then, since they appear much simpler than that. As an example, I open "plt_cod_hst_orz_key.plo" and the plot_name TlkString there gives me an ID of 368782 (the generic viewer from the Toolset reports the same value as the content of this TlkString, whether or not the file is renamed). Now I open the talk table "singleplayer_en-us.tlk" and the content of the string with StrRef 368782 there is "The Key to the City". This seems logical enough..?

#49
Werefox009
  • Members
  • 18 posts
plt_gen00pt_attributes.plo was the file I was looking at, I never looked at any .tlk files at all.

#50
tmp7704
  • Members
  • 11,156 posts
Checking the "plt_gen00pt_attributes.plo" file gives me TlkStrings with very low values: things like "3" or "21", stored in memory as "3, 0" and "21, 0" respectively (when taken 4 bytes at a time). That'd mean the 'reference' part in your interpretation is '0' for multiple strings (for that matter, it was 0 for all strings I've checked so far)..?

That said, these IDs are so low that I couldn't find any talk table that includes them, so I can't really check whether my interpretation works for these either.
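
To make the "3, 0" and "21, 0" observation concrete, the short Python sketch below reads the same 8 bytes both as one 64-bit value and as two 32-bit values. It assumes little-endian storage; tlk_readings is a made-up name. It shows why the two readings cannot be told apart whenever the second 32-bit word is zero.

```python
import struct


def tlk_readings(eight_bytes):
    """Return the 8 bytes read as one uint64 and as a pair of uint32."""
    (as_uint64,) = struct.unpack("<Q", eight_bytes)
    first, second = struct.unpack("<II", eight_bytes)
    return as_uint64, (first, second)


if __name__ == "__main__":
    # The "21, 0" pattern from the post above, taken 4 bytes at a time.
    print(tlk_readings(struct.pack("<II", 21, 0)))
    # The ID 368782 quoted earlier in the thread, stored as a single uint64.
    print(tlk_readings(struct.pack("<Q", 368782)))
```

With a zero second word, the uint64 value equals the first uint32, so a sample with a non-zero second word would be needed to separate the interpretations.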