From version 8.00 onwards FTN95 includes a 64-bit compiler as well as the long-established 32-bit version.
64-bit code is produced by using the /64 compiler switch. Plato also has appropriate configuration options
for enabling 64-bit code production (these ultimately pass the /64 option to the compiler).
It is not possible to mix 32-bit and 64-bit code. Compilation and linking are either all 32-bit or all 64-bit.
Linking is done by the new 64-bit linker, SLINK64.
FTN95 creates 64-bit executables and DLLs when:
- the option /64 is used on the FTN95 command line,
- SLINK64 is used in place of SLINK, and
- salflibc64.dll and clearwin64.dll are used in place of salflibc.dll.
Differences between 32- and 64-bit Fortran
There are some differences between the 32 and 64-bit environments. These are summarised below:
- 64-bit programs run in a much larger address space. 32-bit FTN95 programs could, with some configuration, use 3GB of RAM.
64-bit programs are effectively limited to the amount of RAM, with hard disk backing, that you can give them.
- 64-bit programs cannot run on 32-bit versions of Windows.
- 64-bit programs cannot use 32-bit DLLs.
- Extended precision (REAL*10 and COMPLEX*20) is not available in 64-bit mode.
- When distributing your programs you must also distribute the 64-bit run time DLLs. These are salflibc64.dll and clearwin64.dll.
- 64-bit Microsoft Windows HANDLEs are addresses (64-bit integers). So if a Windows handle is used explicitly in Fortran code, it
will currently appear as a 32-bit (KIND=3) integer and must become a 64-bit (KIND=4) integer for 64-bit applications. FTN95 has a
special KIND value (7) that is interpreted as KIND=3 for 32-bit applications and KIND=4 for 64-bit applications. Alternatively
INTEGER(KIND=CW_HANDLE) can be used together with standard INCLUDE and MOD files because CW_HANDLE is defined as a parameter
with value 7. Windows HANDLEs are mainly used with %lc, %hw and some direct calls to the Windows API.
- There is currently no optimiser or check mode for 64-bit programs.
- Assembler in CODE/EDOC is not compatible between 32-bit and 64-bit modes.
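The HANDLE point in the list above can be illustrated with a minimal sketch. It assumes it is compiled with FTN95 (other compilers will normally reject KIND=7), and the variable name window_handle is hypothetical. The program only declares an address-sized integer and reports the width chosen for the current build.

```fortran
program handle_kind_sketch
  implicit none
  ! KIND=7 is the FTN95-specific kind described above: it is treated as
  ! KIND=3 (32-bit) without /64 and as KIND=4 (64-bit) with /64.
  integer(kind=7) :: window_handle      ! hypothetical variable name
  window_handle = 0
  ! BIT_SIZE reports the width actually selected for this build.
  print *, 'handle width in bits:', bit_size(window_handle)
end program handle_kind_sketch
```

Declaring handles this way keeps a single source file valid for both 32-bit and 64-bit builds.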
This initial 64-bit full release does not allow you to combine /64 with /optimise or /check (or with options that imply /check). It includes a beta version of the 64-bit debugger, SDBG64, that can be used together with /debug etc. on the FTN95 command line. Developers can still use /check etc. without /64 in order to test for run-time faults during development.
Extended precision (REAL*10) is not available when creating 64-bit applications.
SLINK64 can be used in:
a) command line mode
b) interactive mode or
c) script file mode
Here is an example of using command line mode...
FTN95 prog.f95 /64
FTN95 sub.f95 /64
SLINK64 prog.obj sub.obj /file:prog.exe
Here is an example of using interactive mode...
$ lo prog.obj
$ lo sub.obj
$ file prog.exe
Here is an example of using a script file...
where prog.inf contains...
For further information see below or type...
SLINK64 automatically scans commonly used Windows DLLs. If a Windows function located in (say) xxx.dll is
reported as missing then the DLL should be loaded by using a script command of the form
where C:\Windows illustrates the value of the %windir% environment variable.
Note that the initial release of SLINK64 can construct executables and DLLs but not static libraries.
The 64-bit debugger is provided as a beta release for users to test. It operates in essentially the same way
as the corresponding 32-bit debugger.
64-bit ClearWin+ was previously available for use with third-party compilers via clearwin64.dll. This DLL has now
been extended for use with 64-bit FTN95. Users who have already adapted their code for use with third-party compilers
can continue to use their modified code. Alternatively native FTN95/ClearWin+ code can be used without change apart from the following exceptions:
- 64-bit Microsoft Windows HANDLEs are addresses (64-bit integers). So if a Windows handle is used explicitly in Fortran code, it will currently appear as a 32-bit (KIND=3) integer and must become a 64-bit (KIND=4) integer for 64-bit applications. FTN95 has a special KIND value (7) that is interpreted as KIND=3 for 32-bit applications and KIND=4 for 64-bit applications. Alternatively INTEGER(KIND=CW_HANDLE) can be used together with standard INCLUDE and MOD files because CW_HANDLE is defined as a parameter with value 7. Windows HANDLEs are mainly used with %lc, %hw and some direct calls to the Windows API.
- The function CLEARWIN_INFO@ now returns an INTEGER(KIND=7) value.
- 64-bit ClearWin+ does not currently support SIMPLEPLOT (%pl). Also a few very old graphics routines have not been ported to 64-bit ClearWin+.
- The function cpu_clock@ is not available for 64-bit applications and has been replaced by rdtsc_val@, which is declared as INTEGER(KIND=4) FUNCTION RDTSC_VAL@()
Use the SRC command line option /r for 64-bit applications and link the resulting .res file (together with the .obj files) via SLINK64.
Current experience suggests that using the "default.manifest" in a resource script causes the resulting 64-bit application to fail to load. However, a user-supplied manifest file can improve the appearance. The text of a suitable manifest file is presented below.
A RESOURCES directive can be used at the end of a 64-bit Fortran main program but it only has effect when used with FTN95 command line options /LINK or /LGO. Otherwise a separate call to SRC is required. (For Win32 main programs, FTN95 automatically adds the resources to the main object file).
Silverfrost INCLUDE and MOD files
Silverfrost INCLUDE files have been modified so that Microsoft HANDLEs have type INTEGER(KIND=7).
Silverfrost MOD files can be used without change provided they are updated to those in this release.
Note that user FTN95 MOD files for 64-bit applications may differ from those for 32-bit applications. So FTN95 uses the
extension .mod64 for 64-bit MOD files whilst retaining the extension .mod for 32-bit MOD files. The corresponding
object files always differ and the respective linker (SLINK or SLINK64) will reject object files of the wrong kind.
By default FTN95 uses the extension .obj for both 64-bit and 32-bit object files. For projects, both Plato and Visual
Studio retain the default extension and use a system of sub-folders in order to create executables for different
platforms (such as Win32, x64 and .NET) and for different configurations (such as Debug, CheckMate and Release).
Users who prefer to build their applications using batch and/or makefiles can adopt a similar sub-folder approach to
that used by Plato and Visual Studio. Alternatively, 64-bit object files can be given a different extension (e.g. .o64)
by using /BINARY (together with the object file name) on the FTN95 command line. In that way, 64-bit and 32-bit object
files could reside in the same folder.
The associated release of Plato is configured by default to use FTN95 when you select "Release x64" on the main toolbar. Previously this used gFortran. The default can be changed from the Options dialog.
Plato can launch the 64-bit debugger as an external application.
Like salflibc.dll, salflibc64.dll and clearwin64.dll can be freely redistributed with your applications and DLLs.
Additional notes on porting from 32-bit to 64-bit applications
1) When using the standard Fortran SIZE intrinsic, FTN95 with /64 returns a 64-bit integer despite the fact that this
is not strictly Standard conforming. In certain very special circumstances, this change can cause existing code to fail.
For example, failure will occur if SIZE(x) appears as the value of an argument to an overloaded subprogram (i.e. a
subprogram that has various definitions depending on the types of its arguments). A new command line option /SIZE32
is provided in order to resolve this conflict.
2) It is possible that there may be some slight loss of precision when porting from 32-bit to 64-bit calculations. This
is mainly because some FTN95 32-bit mode floating point calculations actually use hidden extended precision on the way
to producing double or single precision results. It is therefore possible that the process of porting to 64-bit mode may
expose a numerically unstable calculation (i.e. one that depends critically on the level of round-off error). In the same
way, in extreme cases it is possible that new exceptions may appear at runtime due to floating point overflow. Overflow
can occur directly or as the result of dividing by a value that has underflowed to zero. In some cases it is possible to
resolve these issues by using a scaling factor in the calculations.
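The SIZE issue in note 1 can be sketched as follows. This is an illustration in standard Fortran, not the documented fix (which is the /SIZE32 option): the generic name report and the module report_mod are hypothetical. The generic interface has only a 32-bit specific procedure, so a call such as report(size(x)) depends on SIZE returning a 32-bit integer. Converting the result explicitly keeps the call resolvable whichever kind SIZE returns.

```fortran
module report_mod
  implicit none
  integer, parameter :: i4 = selected_int_kind(9)    ! 32-bit integers
  interface report                  ! hypothetical overloaded subprogram
    module procedure report_i4
  end interface
contains
  subroutine report_i4(n)
    integer(kind=i4), intent(in) :: n
    print *, 'element count:', n
  end subroutine report_i4
end module report_mod

program size_kind_sketch
  use report_mod
  implicit none
  real :: x(10)
  x = 0.0
  ! "call report(size(x))" only resolves if SIZE returns a 32-bit integer.
  ! An explicit conversion makes the argument kind independent of SIZE:
  call report(int(size(x), kind=i4))
end program size_kind_sketch
```

The explicit INT conversion is a source-level alternative for individual calls. /SIZE32 addresses the same conflict for a whole compilation without source changes.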
Further information about SLINK64
The SLINK64 command line
SLINK64 can be used in 3 ways...
1) It can use a series of commands from a file (recommended). The commands are placed in a file with the .inf or .link suffix, and SLINK64 is invoked thus:
2) It can be used interactively, using the same commands as in (1).
3) It can be used from the command line. The command line options are derived from the command specifications. Thus the command
lo <obj file> can be coded on the command line as /lo <obj file>
- load(lo) <file> - Loads the file, which must be FTN95/SCC 64-bit object code.
- map <file> - Requests a link map, to be placed in the specified file. If the file argument is omitted, the map is placed in a file whose name is derived from the name of the DLL or EXE file being created.
- file <exe or dll file> - Completes the linking operation and puts the result in the given file name. Note that the choice of suffix (DLL or EXE) determines the type of file created. Currently all entry points in the code are exported in the case of a DLL.
- windows - This command forces the creation of a WINDOWS application, which does not use a console. This is normally used in conjunction with ClearWin+ code.
- load(lo) <file.dll> - Uses the entry points in the specified DLL to satisfy calls in the code. The DLL must be available at run-time.
- load(lo) <file.res> - Loads a resource file created with SRC using the /r switch. This is the same SRC command used in 32-bit mode except that the /r switch must be used.
- image_base <hex address> - Specifies the base address for the link (not normally required, and can be overwritten at run time by Windows).
- stack_size <hex number> - Specifies the stack size. The default value is 0x1000000 (16 MB).
- alias <name> <alias> - Sets up an alias to an external name when making a DLL. Note that the names are case sensitive. This was added to enable gFortran to call a DLL built with FTN95. It circumvents the problem that gFortran uses lower case names while FTN95 uses upper case names! It may have other specialised uses.
- help - Prints out abbreviated help information to the console.
- quit(q) - Quits SLINK64 without saving anything.
SLINK64 is automatically called when /link or /lgo is used on the FTN95 command line. The name of the
executable or DLL can be supplied after /link (this is optional for executables but mandatory for DLLs).
Also /stack can be included followed by the stack size as a number of megabytes. /map can also be used
in this context.
The WINAPP directive in the Fortran code creates a Windows application and this directive can optionally
be followed by the name of a resource script. Alternatively a resource script can be included by placing
the script after the main program by using the RESOURCES directive.
Here are the required SLINK64 commands for three slightly more complicated scenarios:
1) To link a simple program that uses a DLL:
The file (say) ExtraDLL.dll is scanned for entry points but it isn't incorporated in MyProgram.exe - so MyProgram.exe will require the DLL somewhere on the path at runtime.
2) To link a number of files to create a DLL that exports all subroutine/function names:
3) To create a windows program with some ClearWin+ code that uses resources:
The resources are prepared by:
SRC MyResources.rc /r
Then the slink commands are:
Further general information about 64-bit FTN95
Programs compiled with FTN95 using the /64 option use the AMD64 instruction set (subsequently adopted by Intel, and
referred to as x64 or x86_64) which is almost universally available on modern PCs. This code cannot be mixed
with legacy 32-bit code, nor can it access legacy 32-bit DLLs. 64-bit object files must be linked using the new
utility SLINK64. This object file format is incompatible with all third-party link utilities.
The default size of INTEGER variables remains unchanged (2^31-1), so INTEGER*8 (8-byte) variables
must be used to index extremely large arrays. These variables are implemented in a more efficient and natural way
in 64-bits. Note that some arrays that would not fit in the old 4GB limit may still be indexable using default sized integers, for example a REAL*8 array of 2,000,000,000 elements would occupy nearly 16GB of memory, but could be
indexed using default integers.
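The arithmetic behind this example can be checked with a short sketch in standard Fortran. The variable names are illustrative only. The element count fits in a default integer, but the byte count does not, so it is held in a 64-bit integer.

```fortran
program large_index_sketch
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! at least 64-bit integers
  integer :: n_elements
  integer(kind=i8) :: n_bytes
  n_elements = 2000000000              ! below huge(0), which is 2**31 - 1
  n_bytes = int(n_elements, i8) * 8_i8 ! 8 bytes per REAL*8 element
  print *, 'largest default integer:', huge(n_elements)
  print *, 'elements:', n_elements
  print *, 'bytes needed:', n_bytes
end program large_index_sketch
```

The byte count (16,000,000,000) exceeds the default integer range, whereas the index range does not, which is why default integers remain usable as subscripts here.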
The main value of 64-bit compilation is that the available address space has increased from 4GB to
approximately 1.8 x 10^19 bytes! This means that for the foreseeable future (possibly forever!), the size of programs
will be limited only by the amount of physical memory available on a system.
Arrays that are ALLOCATEd, or which are in COMMON or in MODULEs can exceed the 4GB limit, except that initialised
arrays must fit within the .EXE or .DLL file to which they belong, and the size of these files cannot extend beyond
the 4GB limit. This is a Microsoft limit, but is fairly reasonable, since the time needed to load a 4GB file would be excessive!
COMMON blocks and MODULE arrays are allocated dynamically as a program starts in order to enjoy no 4GB restrictions.
This is applied to all such storage blocks, because a program may exceed the 4GB limit even though each individual array lies within this limit.
Local arrays (static or dynamic) are restricted as in 32-bit mode. This is because it is not feasible to extend the hardware stack to sizes > 4GB, and SAVE'd variables must fit within the EXE or DLL file to which they belong. Users who require a very large local array should put it in a COMMON block or MODULE referenced by only the one routine.
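The suggestion of moving a large local array into a MODULE used by a single routine might look like the following sketch. The names solver_workspace, work and accumulate are hypothetical, and the array is kept small here so that the example runs anywhere. In a real case n_work would be far larger.

```fortran
module solver_workspace              ! hypothetical module name
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n_work = 1000000   ! kept small for illustration
  real(kind=dp) :: work(n_work)            ! would otherwise be a local array
end module solver_workspace

subroutine accumulate(total)
  use solver_workspace               ! the only routine that uses the module
  implicit none
  real(kind=dp), intent(out) :: total
  integer :: i
  do i = 1, n_work
    work(i) = 1.0_dp
  end do
  total = sum(work)
end subroutine accumulate

program workspace_sketch
  implicit none
  double precision :: total
  call accumulate(total)
  print *, 'sum of workspace:', total
end program workspace_sketch
```

Because only accumulate references the module, the array behaves like private working storage for that routine while avoiding the stack and the file-size limit described above.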
Since the code can be distributed across multiple DLLs plus an EXE file, the code itself is also not limited to
4GB - although this is not usually a serious concern.
The various 64-bit Windows operating systems provide less than the full 1.8 x 10^19 byte address space, and the size
of this space varies somewhat with the available physical memory on the system. Nevertheless, these limits are very generous and will increase as physical memory becomes more plentiful. In part, these limits are due to the fact that the
paging mechanism itself requires memory.
For further information see https://msdn.microsoft.com/en-us/library/aa366778.aspx.
In 64-bit mode the pair of DLLs SALFLIBC64.DLL and CLEARWIN64.DLL takes the place of the 32-bit SALFLIBC.DLL. Currently CLEARWIN64.DLL (which contains much more than ClearWin+) is compiled with Microsoft C++. In the future this may be absorbed into SALFLIBC64.DLL but will remain independent for use with third-party compilers.
Perhaps surprisingly, FTN95.exe and SLINK64.exe are 32-bit executables, and so still require access to SALFLIBC.DLL at compile time.
Note that the extra executables and DLLs to support 64-bit mode can coexist with those that support 32-bit operations
because they have different names.
Contents of a clrwin.manifest file...
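<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
<trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
<security>
<requestedPrivileges>
<requestedExecutionLevel level="asInvoker" uiAccess="false"/>
</requestedPrivileges>
</security>
</trustInfo>
<dependency>
<dependentAssembly>
<assemblyIdentity type="Win32" name="Microsoft.Windows.Common-Controls" version="6.0.0.0"
processorArchitecture="*" publicKeyToken="6595b64144ccf1df" language="*"/>
</dependentAssembly>
</dependency>
</assembly>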
64-bit CODE/EDOC in FTN95
The AMD 64-bit architecture
This architecture was invented by AMD, and was later adopted by Intel when their own Itanium 64-bit architecture
was not received with enthusiasm. It is commonly called x86-64 or x64 (Intel's own name for it is Intel 64). It is the basis of most modern PCs, and is targeted by
FTN95 when the /64 switch is used.
The AMD 64-bit architecture has 16 general purpose integer registers:
RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI, R8, R9, R10, R11, R12, R13, R14, R15.
The first eight registers correspond to the 32-bit register set, and retain some of the same functionality.
Thus RSP is the stack pointer and descends as the stack expands, RCX, RSI and RDI are used for string operations just
as they are in 32-bit mode, and RAX is used by convention to return integer function values. RBP does not correspond in
function to EBP. However, it is given a special function in Silverfrost code (explained later), and should not be modified
in normal circumstances.
All these registers hold 64 bits (8 bytes) and can therefore hold a pointer to anywhere in the 64-bit address space.
64-bit programs can access two sets of different floating point registers - the old floating point stack of eight 80-bit registers, and a set of registers designated XMM0 - XMM15, and known as the SSE registers. These registers can hold
multiple values simultaneously - four REAL*4 floating point values, or two REAL*8 values. They can also hold integer values. Thus these registers are 16 bytes in width. These registers do not 'know' what data they contain - so it is up to the programmer to keep track. In particular, if you load a REAL*8 value into an XMM register and wish to store it as a REAL*4, you must first use the appropriate conversion instruction.
Strangely, the old coprocessor stack instructions do offer some functionality that is not present in the newer SSE
instruction set - for example SIN and COS can be evaluated in one instruction.
Silverfrost CODE/EDOC conventions
Let us start with a simple executable example of a 64-bit CODE/EDOC sequence that simply sums a vector of REAL*8 values. It is not meant to be optimal because it does not use the parallel execution facilities of the SSE registers.
MOV_Q RDX,=VEC ! The '=' denotes a (non-immediate) constant or, as in this case, the address of an argument
MOV_Q R14,=N ! Remember all addresses are 64-bit - hence the use of MOV_Q
MOVSX_Q R14,[R14] ! Instructions and register names are case insensitive
! N is only a 32-bit integer, so it is sign extended to 64-bits
XORPD XMM0,XMM0 ! This is one way to zeroise an XMM register: it does a bitwise exclusive OR of the register with itself
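1 ADDSD XMM0,[RDX]
ADD_Q RDX,8 ! This uses an immediate constant
SUB_Q R14,1 ! Count down the remaining elements (mnemonic assumed by analogy with ADD_Q)
JNE $1 ! Labels are denoted by a '$'
MOV_Q RCX,=ANS ! Load the address of the argument ANS (assumed, following the =VEC pattern)
MOVDQU [RCX],XMM0 ! Store away the accumulated answer in the argument ANS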
This illustrates a variety of points:
1) The instructions that operate on the integer registers can operate on 1, 2, 4, or 8 byte operands. These are
distinguished by a suffix, thus the MOV instruction takes the forms MOV_B, MOV_H, MOV, MOV_Q.
2) Unlike the 32-bit CODE/EDOC, the register name does not change when the operation operates on a smaller number of bytes.
3) Operations that work on 4 bytes of a register (MOV, ADD, etc) also clear the upper 4 bytes of the register,
whereas 2-byte and 1-byte instructions do not change the other bytes of the register. This is a feature of
the hardware, not a Silverfrost convention.
4) Labels are prefixed by a '$' when used, just as is the case in 32-bit mode.
5) When accessing a Fortran argument, you need to first access its address (an 8-byte quantity). The notation =N
is used to access the address of argument N. The '=' notation can also be used to address a constant in memory.
6) The MOVSX_Q instruction sign extends a 32-bit integer to 64 bits. In situations where a number is known to be
non-negative, this extension can be obtained for free using point 3 above.
In general a good way to learn to write instructions inside CODE/EDOC is to compile simple code samples with the /EXPLIST option, which will display the instructions generated by the compiler line by line in essentially the same format that you will use.
Referencing COMMON, MODULE, and ALLOCATE'd variables
Because most COMMON blocks are allocated as the program starts up (as are large arrays in MODULE's) the simplest way to
access these objects, as well as explicitly ALLOCATE'd arrays, is to take their address before entering the CODE/EDOC.
MOV_Q [R10+8],42 !This sets beta(2) to the value 42
The 64-bit address space
The 32-bit address space provided a theoretical maximum of 2^32 (4 x 10^9) addressable bytes. Correspondingly,
the 64-bit address space offers a theoretical maximum of 2^64 (1.8 x 10^19) addressable bytes. This means that, rather like
in the early days of the 32-bit architecture, when a typical computer might have vastly less than 2^32 bytes (4 GB) of
memory, the virtual address space is only very sparsely populated.
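The two theoretical limits quoted above can be reproduced with a few lines of standard Fortran. This is a sketch for checking the figures only. Double precision is used because 2 to the power 64 does not fit in a 64-bit signed integer.

```fortran
program address_space_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(kind=dp) :: space32, space64
  space32 = 2.0_dp**32        ! bytes addressable with 32-bit addresses
  space64 = 2.0_dp**64        ! bytes addressable with 64-bit addresses
  print *, '32-bit address space (bytes):', space32
  print *, '64-bit address space (bytes):', space64
  print *, 'ratio:', space64 / space32
end program address_space_sketch
```

The ratio between the two is itself 2 to the power 32, so the 64-bit space is about four thousand million times larger than the 32-bit one.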
Indeed, the 64-bit virtual address space is so large that it isn't possible to provide page tables to cover the
address space. This means that the amount of virtual address space available to a program is determined in a way
that depends on the version of Windows in use, and the total amount of main memory on the computer (say 16 GB).
This number is still extremely large. However, it is relevant if you use calls to VirtualAlloc to access high memory
addresses in an absolute way.
Using the SSE registers for parallel computation
Instructions like MOVDQA will load a pair of REAL*8 numbers into an XMM register. Since these numbers are just bits,
the instruction can also be used to move four REAL*4 numbers into an XMM register. However this instruction will fault
if the data is not 16-byte aligned. This is problematic because REAL*4 and REAL*8 numbers are aligned wherever possible
(EQUIVALENCE can prevent alignment) to 4 and 8 bytes respectively, not to 16 bytes. In practice it turns out that the MOVDQU instruction (which is
reputed to be slower than MOVDQA) seems to run at the same speed for aligned data, and only somewhat slower for non-aligned data, but generates no alignment faults.
It is also worth reading this discussion about alignment issues: http://lemire.me/blog/2012/05/31/data-alignment-for-speed-myth-or-reality/