A product configuration note, while I'm thinking of it. It's important not to rely much on an installer. It's particularly bad to rely on installation of files in scattered directories, especially system directories. A user should be able just to run the app wherever he finds it: straight off the CD, or from a network location, or in a local directory installed by a different user or under a different OS. Remember that lots of users will be dual booting Win98 and Win2000 even a couple of years from now: we shouldn't require separate installations for both.
The Windows resource-updating API is documented here:
http://msdn.microsoft.com/library/psdk/winui/resource_3bcj.htm
http://msdn.microsoft.com/library/psdk/winui/resource_05yr.htm
http://msdn.microsoft.com/library/psdk/winui/resource_6coj.htm
http://msdn.microsoft.com/library/psdk/winui/resource_8f3n.htm
(I've listed the pages separately because you can't navigate between them without a Java-enabled browser.)
I spent most of today reorganizing the BeEmulation header files to (1) straighten out a mess that was preventing resolution of exported data (be_bold_font) and (2) make a reasonable system for longer-term use. My hope is to have a clear set of rules about what to #include in different situations. Here is what it is shaping up to be. In normal application code, we will require only one additional header to build a non-BeOS version: BeEmulation.h. In the BeEmulation library itself, we will require that header plus one more: BeEmulationPrivate.h. In app code other than GP (e.g., the HelloWorld test app), we will require BeEmulation.h and an additional #define: DEFINE_GOBE_TYPES.
Finally got the whole BeOS HelloWorld app to compile and link, and started work on registering a win32 window class and creating a win32 window within the emulation. Haven't yet figured out exactly where to put the main WndProc, but did figure out that BApplication::Run is going to have to do more than I've got it doing now. It's currently just running the win32 message pump, but I think it's also going to have to loop through and execute all known BLoopers. That may mean I should move the win32 message pump somewhere else.
Also simplified BeEmulation header organization a little more: eliminated the need for #define DEFINE_GOBE_TYPES by splitting a bunch of #defines out into a new GobeTypes.h.
I think I'm going to do the first draft of the core message handling parts of the BeEmulation DLL using a C-style wndproc, but using message crackers (see WindowsX.h) to keep it well organized and easy to maintain. After that is up and limping, it might be worth also experimenting with an MFC implementation. There is an explanation at http://msdn.microsoft.com/library/devprods/vs6/visualc/vccore/_core_alternatives_to_the_document.2f.view_architecture.htm of how to use MFC without using its document/view architecture. Using its document/view architecture is probably out of the question -- I can't imagine how we would make GP's BWindow/BView architecture work with that.
This morning I'm trying to solve some stupid little problem with getting a window created. I'm getting an error 2 (FileNotFound) from GetLastError after a failed CreateWindowEx. On the theory that the file it means is the app or dll image file, I thought maybe the problem might be with using the app instance handle in code that lives in the DLL, but changing to the DLL's hModule doesn't help. Maybe the problem is with where the wndproc lives.
Hmmm. The wndproc is getting called during this window creation process that's failing -- it gets a WM_CREATE, so it must be close. Am I not handling WM_CREATE correctly? I'm just passing it to DefWindowProc.
The window creation problem I was having earlier turned out to be caused by a mistake in the WndProc, which in turn was probably caused by a mistake in one of Microsoft's message-cracker macros: the WM_CREATE handler called through the cracker has to return a non-zero value (the opposite of DefWindowProc, which returns zero) on a WM_CREATE.
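The return-value trap can be modelled without any Windows headers. This is a portable sketch, not the real WindowsX.h: handleCreate and forwardCreate are hypothetical names that mirror, as far as I recall them, what the HANDLE_WM_CREATE and FORWARD_WM_CREATE crackers do, and defaultProc is a stand-in for the default window procedure's zero return on WM_CREATE.

```cpp
#include <cassert>
#include <functional>

// Portable model: a plain long stands in for the win32 LRESULT.
using Result = long;

// Stand-in for the default window procedure: 0 means "continue creation".
inline Result defaultProc() { return 0; }

// Models the handler side of the cracker: the handler returns a bool-like
// value; true becomes 0 (continue) and false becomes -1 (abort creation).
inline Result handleCreate(const std::function<bool()>& onCreate) {
    return onCreate() ? 0 : -1;
}

// Models the forwarding side: the raw result is squeezed into a bool,
// so the default procedure's 0 reads as "false".
inline bool forwardCreate(const std::function<Result()>& proc) {
    return proc() != 0;
}

// A handler that just forwards to the default procedure aborts creation.
inline Result naiveCreate() {
    return handleCreate([] { return forwardCreate(defaultProc); });
}

// A handler that explicitly returns true lets creation continue.
inline Result fixedCreate() {
    return handleCreate([] { defaultProc(); return true; });
}
```

In this model naiveCreate() yields -1, which is the "window creation failed" signal, while fixedCreate() yields 0.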
I'm starting now to set up the view-list hierarchy associated with BWindow and BView objects. It seems pretty straightforward, although defining the exact relationship between an HWND and a BWindow and its BViews will not be. It looks like HWND functionality is split across the BWindow and BView classes, so our wndproc will need to enqueue messages both for the BWindow associated with each HWND and for the BWindow's BView children.
Here is the scheme I'm thinking of using for tying BWindows and BViews to HWNDs. Each BWindow will have an associated HWND, and so will its BViews. The top BView's HWND will be the same as the BWindow's. All other BViews will be associated with HWNDs that are children of the HWND for the BWindow and its top BView.
We will store a BWindow/BView pointer in each HWND's first window-long so that wndprocs can easily find the BeOS data associated with an HWND. To determine whether the HWND is associated with a BWindow or a BView, the wndproc will have only to ask whether the HWND is a child window. If not, the pointer is to a BWindow, through which the wndproc will be able to get associated BViews. If so, the pointer is to a BView, through which the wndproc will be able to get associated BWindows.
Changed my mind. Here's a scheme that will be simpler and that maps more directly to BeOS functionality. Every BWindow will have an HWND responsible for managing the window's non-client area, probably including the menu bar. We won't use the client area of the BWindow's HWND at all. Instead, we will create an HWND for each BView at the time that BView gets attached to the BWindow. This HWND will be a child of the HWND for the BWindow, and will have no non-client area. This gives us a simple one-to-one correspondence between BViews and the HWNDs we use for drawing, and between HWNDs and BWindows.
Just before I left last night I concluded that we would probably need to use different win32 class names for each BWindow the BeOSEmulation creates, but I neglected to write down the reason, and now I can't remember why I thought so. This is just a reminder to keep that possibility open.
Implemented all the ConvertToScreen/ConvertFromScreen functions for BView and BWindow as part of fixing the weird window size/placement behavior of the HelloWorld sample app. As near as I can tell, HelloWorld is working perfectly now, and has a reasonably good BWindow/BView design. Actually, now that I think of it, I'm not sure whether it's doing window_alignment (i.e., right/left/center placement of BView contents) correctly, but I don't think that's important right now. Next step is to find the simplest possible BeOS sample app that uses menus.
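A minimal sketch of the coordinate conversion idea, assuming each window or view only knows its origin relative to its parent. Node, convertToScreen and convertFromScreen here are hypothetical free-standing names for illustration, not the BeEmulation or BeOS member functions.

```cpp
#include <cassert>

struct Point { int x; int y; };

// Hypothetical node in the window/view tree. For the window itself,
// originInParent is its position on the screen and parent is null.
struct Node {
    Point originInParent;
    const Node* parent;
};

// Local -> screen: add every ancestor's origin on the way up the tree.
inline Point convertToScreen(const Node& node, Point p) {
    for (const Node* n = &node; n != nullptr; n = n->parent) {
        p.x += n->originInParent.x;
        p.y += n->originInParent.y;
    }
    return p;
}

// Screen -> local: subtract the same offsets.
inline Point convertFromScreen(const Node& node, Point p) {
    for (const Node* n = &node; n != nullptr; n = n->parent) {
        p.x -= n->originInParent.x;
        p.y -= n->originInParent.y;
    }
    return p;
}
```

The two functions are exact inverses, so a round trip through both returns the original point.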
In setting up this new little MessageWorld project this morning, I had to figure out again how to set up the MSVC build environment for using BeEmulation, so I should write it down here even though it will probably change again fairly soon. It's pretty simple. In the Project Settings dialog, (1) add __INTEL__ to the "Preprocessor definitions" list, (2) check the "Ignore standard include paths" box, and (3) add "e:\GobeSource\BeEmulation,c:\bin\msvc6\vc98\include" (or wherever you keep the emulation and the VC98 include folder, respectively) to the "Additional include directories" list.
One more thing. To use BeEmulation in an MSVC project, you need also to add BeEmulation.lib to the list of "Object/library modules" and add "e:\GobeSource\BeEmulation\debug" (or wherever the BeEmulation.DLL happens to be) to the "Additional library path" list. Both are in the Project Settings dialog, Link panel, Input category.
Okay, another thing. Every project that uses BeEmulation needs to compile and link its own copy of BeEmulation/WinMain.cpp. This is just an implementation of WinMain() that turns around and calls main() after packaging up arguments, saving the HINSTANCE, and maybe a few other initialization things that will be common to all BeEmulation apps.
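As a rough illustration of the "package up arguments and call main()" step, here is a portable sketch. splitCommandLine and callMain are hypothetical names, and the splitting rules are deliberately simplified (whitespace separates arguments, double quotes group words, no backslash escapes); the real WinMain.cpp may well do this differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits a raw command line on spaces/tabs; double quotes group words.
// Simplified on purpose: no backslash-escape handling.
inline std::vector<std::string> splitCommandLine(const std::string& cmd) {
    std::vector<std::string> args;
    std::string cur;
    bool inQuotes = false;
    bool haveArg = false;
    for (char c : cmd) {
        if (c == '"') {
            inQuotes = !inQuotes;
            haveArg = true;  // "" still counts as an (empty) argument
        } else if ((c == ' ' || c == '\t') && !inQuotes) {
            if (haveArg) { args.push_back(cur); cur.clear(); haveArg = false; }
        } else {
            cur += c;
            haveArg = true;
        }
    }
    if (haveArg) args.push_back(cur);
    return args;
}

// Builds an argv-style array (program name first, null-terminated)
// and hands it to a main()-shaped function.
inline int callMain(const std::string& programName, const std::string& cmdLine,
                    int (*appMain)(int, char**)) {
    std::vector<std::string> args = splitCommandLine(cmdLine);
    args.insert(args.begin(), programName);
    std::vector<char*> argv;
    for (std::string& a : args) argv.push_back(&a[0]);
    argv.push_back(nullptr);
    return appMain(static_cast<int>(args.size()), argv.data());
}
```

A WinMain built along these lines would pass its command-line string to callMain together with the app's own main, after stashing the HINSTANCE wherever the emulation expects it.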
I think I'm close to having an early cut at menu-creation up and running in BeEmulation.dll. Right now I'm creating and maintaining parallel win32 and internal structures to represent the menu hierarchy, but it doesn't seem very clean; I'm still not convinced that the internal structures are really necessary, but maybe as I get farther into it I'll begin to see why. Added a bunch of basic BHandler and BLooper code today just to be able to see more clearly what else is needed. My guess is that, when I turn my attention to making menus actually communicate with the app, I'll have to get more of these classes implemented and working correctly.
Menu creation should be working now in this little MessageWorld sample app that I've been working on for the last couple of days, but for some reason, it's not. I thought at first that maybe the menu was just getting covered up by the BView hwnd, but that doesn't seem to be the case. It could still be that the initial size/position of the menubar is wrong, however, in any number of ways. I may also be overlooking some failure to get the menubar properly attached to the window. Still working on it.
Aha. I'll bet it's failure to properly handle menu-related messages in the wndproc.
Nope. Here's what the problem is. You can't use the win32 SetMenu to add a newly-created empty HMENU to a window and then go back and fill it up. My guess is that the window's client/non-client areas won't get resized correctly to make room for the menu if you try to set the menu while it's empty. So how am I going to make BeOS apps that aren't constrained by this ordering problem work correctly? I need to find some way to defer the win32 SetMenu until after the last BeOS AddItem.
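One way to picture the deferral is a menu bar that only records items and attaches itself to the window at a later, known point. The sketch below is a portable model under that assumption: WindowModel and DeferredMenuBar are hypothetical names, flushTo stands in for the eventual win32 SetMenu call, and the choice of when to flush (for example, just before the window is first shown) is a guess rather than anything settled above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the native window; setMenuCalls counts simulated SetMenu calls.
struct WindowModel {
    std::vector<std::string> menuItems;
    int setMenuCalls = 0;
};

class DeferredMenuBar {
public:
    // Mirrors a BeOS-style AddItem: only records the item, touches no window.
    void addItem(const std::string& label) {
        items_.push_back(label);
        dirty_ = true;
    }

    // Attaches the fully built menu. Does nothing while the menu is empty
    // or unchanged, so an empty menu never reaches the window.
    bool flushTo(WindowModel& window) {
        if (!dirty_ || items_.empty()) return false;
        window.menuItems = items_;
        ++window.setMenuCalls;
        dirty_ = false;
        return true;
    }

private:
    std::vector<std::string> items_;
    bool dirty_ = false;
};
```

Because flushTo is idempotent until another item is added, it can be called from more than one place without attaching the menu twice.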
Menu creation is mostly working now. Exceptions: (1) I'm not satisfied with how BSeparatorItems are being handled; (2) accelerators aren't working; (3) UTF8 characters (just ellipses so far as I know) aren't getting translated and are showing up as garbage in the menu; (4) items after the first one in the menu bar don't draw until you pass the mouse over them the first time. My guess is that we can't yet create hierarchicals, but this sample app (MessageWorld.cpp) doesn't test them, so I'm not really sure of that. I'll probably clean this up a little more before pressing on with getting communication from menus back to the app working right.
Okay, I have a better BMenuItem design now that cleans up the BSeparatorItem implementation. What it boils down to is just maintaining a win32 MENUITEMINFO structure in each BMenuItem that tracks its state in win32 terms. I'm going to wait until I understand BeOS keystroke handling a little better before trying to make accelerators work, and the problem with UTF8 characters is going to require some research: (1) would UTF8 handling be easier if this were a Unicode app? (2) what's the current MS line on Unicode/UTF8 support on Win98? So it's on to implementing menu-to-app communication. My guess is that it's going to be a win32 WM_COMMAND handler that constructs and enqueues BMessages, but I still don't have a clear idea about how to make a BHandler work smoothly.
Communication from menus back to the app seems to be working pretty well now. All the menus in the MessageWorld applet do what they're supposed to do -- File/New creates a new window, File/Close closes a window, and Options/Say-Hello toggles the message in the client area from "Hello World" to "Goodbye World" and back. It also maintains its checked/unchecked state correctly. I still need to work on the implementation of the Quit() function, but getting the basic implementation right won't be hard. What may be tricky is translating non-menu-generated win32 close events into BeOS close events, so that (for example) we won't have to #ifdef app code to know when the last app window has closed and it's time to exit.
I got just enough BMessage-send and -reply functionality implemented today to allow the MessageWorld applet to do its window counting correctly. I think the way that's working will probably hold up pretty well. I then tried to get all the window/view/app shutdown code working correctly, and wound up making a mess. I need to go back and tackle each of these pieces separately -- make sure the destructors for each one are correct and functioning before trying to get all of them to work together.
The crash on exit problems I was having last night turned out to have been caused by one clear mistake and one bizarre side effect of an unnecessary BList initialization. I plan to look some more at the BList side-effect when I get to a good stopping place just to make sure there isn't something worse lurking there. Right now I'm trying to get BLooper::PostMessage(B_QUIT_REQUESTED) to work correctly, and it's caused me to realize that BLooper is probably going to be more useful and important than I had thought. I've got the win32 message pump in BApplication::Run; it does pretty much the same thing as a BeOS BLooper, and I had thought that if BMessages could be dispatched directly from the win32 wndproc, maybe a full BLooper implementation wouldn't be necessary. But that's not true. We'll need it for BMessages that the app posts to itself. My thinking right now is that the win32 message pump will need also to check for BMessages in be_app->MessageQueue() and dispatch them to the correct BHandler. I'm still trying to figure out how to determine what the correct BHandler is.
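The idea of draining an application-level message queue from inside the native pump can be sketched in portable form. Everything named here (Message, MessagePump, kAppHandler, the integer target token) is hypothetical; in particular, routing by a target token with a fallback to the application handler is only one possible answer to the open "which BHandler?" question, not a decision recorded above.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <utility>

// Hypothetical stand-in for a BMessage: a command code plus a target token.
struct Message { int what; int target; };

class MessagePump {
public:
    using Handler = std::function<void(const Message&)>;
    enum { kAppHandler = 0 };  // fallback target, standing in for the app object

    void addHandler(int token, Handler h) { handlers_[token] = std::move(h); }
    void post(const Message& m) { queue_.push_back(m); }

    // Intended to run once per trip around the native message loop.
    // Handlers may post further messages; those are drained in the same pass.
    int drainQueue() {
        int dispatched = 0;
        while (!queue_.empty()) {
            Message m = queue_.front();
            queue_.pop_front();  // pop before dispatch so handlers can post safely
            auto it = handlers_.find(m.target);
            if (it == handlers_.end()) it = handlers_.find(kAppHandler);
            if (it != handlers_.end()) { it->second(m); ++dispatched; }
        }
        return dispatched;
    }

private:
    std::deque<Message> queue_;
    std::map<int, Handler> handlers_;
};
```

A startup notification such as a ready-to-run message fits the same shape: it is posted once before the loop starts and gets dispatched on the first drain.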
Menu and message management is now in pretty good shape. The MessageWorld applet behaves exactly as it should, including the window-closing and quitting operations. Getting the Quit item's use of PostMessage to work correctly forced me to fix a problem in the BMessage copy constructor. Now moving on to the BasicButton applet.
The BasicButton applet is now building and running, but not yet creating a button. In digging into it, I found more startup and shutdown details that were missing from BeEmulation. This applet does its window creation in its implementation of BApplication::ReadyToRun, so I had to make sure that member function gets called at the right time. I'm trying to implement things like this in more or less the same way that the Be Book claims BeOS does it, so instead of just calling the app's ReadyToRun() function somewhere in the init code, I enqueue a B_READY_TO_RUN BMessage for handling in the general-purpose BMessage dispatcher that I've built into the Windows message pump.
This applet also used a different technique for posting the application B_QUIT_REQUESTED message that wasn't quite working until about an hour ago. I think the interaction between handling of WM_DESTROY and B_QUIT_REQUESTED messages is pretty much right now, but I need to go back and make sure it still works with the other sample apps I've completed.
I've concluded that trying to use the BApplication class in any way before the app creates its own BApplication object is a mistake -- it introduces a lot more ugliness than it prevents. So I've changed WinMain to communicate the application's command-line and HINSTANCE to BeEmulation.dll using just a couple of simple setter/getter functions. I'm now making sure all these sample apps work okay with this simpler startup/shutdown code.
Yes, this is a much better way to do it. Much simpler and cleaner. All sample apps now start up and shut down using only the BApplication object that is actually created by the app itself -- we no longer have a separate one that is used within the DLL before the app creates its own.
The BasicButton is now creating a button, and creating it in the right position. You can click on the button, but it crashes because of code in the wndproc's WM_COMMAND handler that is too specific to menu controls. I'll have to figure a cleaner way of getting information from the BeOS object to the wndproc so that it can be enqueued for dispatching to a BHandler. The Windows message-cracker macro may be getting in the way here. I need to look at whether the WM_COMMAND LPARAM argument would be more generally useful (i.e., could always pass in a pointer to the sending BeOS object) if we weren't using the message cracker in this case.
I think I have a good, simple design for handling different kinds of WM_COMMAND messages now: both the BButton and the BMenu classes are working with it, and the BasicButton sample applet is working correctly. The scheme is that, for every win32 object we create (control, menu, window, etc.), we store a pointer back to the corresponding BeOS object in the GWL_USERDATA window-long. We won't use any other window words, window longs, win32 atoms or whatnot to get at BeOS info from the emulation layer -- just this one consistent scheme for associating win32 and BeOS objects.
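The back-pointer scheme can be modelled portably. In this sketch NativeWindow, EmuObject, bind and dispatchCommand are all hypothetical names: userData stands in for the GWL_USERDATA slot, and the single virtual call stands in for whatever the wndproc does once it has recovered the BeOS-side object.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Portable model of a native object with one pointer-sized user-data slot,
// standing in for the GWL_USERDATA window-long.
struct NativeWindow {
    std::intptr_t userData = 0;
};

// Hypothetical common base for emulated BeOS-side objects.
struct EmuObject {
    virtual ~EmuObject() = default;
    virtual std::string onCommand() = 0;
};

struct EmuButton : EmuObject {
    std::string onCommand() override { return "button invoked"; }
};

struct EmuView : EmuObject {
    std::string onCommand() override { return "view invoked"; }
};

// Done once, when the native object is created.
inline void bind(NativeWindow& w, EmuObject* obj) {
    w.userData = reinterpret_cast<std::intptr_t>(obj);
}

// What a command handler does: recover the object from the sender and
// dispatch through it, with no per-control special cases.
inline std::string dispatchCommand(const NativeWindow& sender) {
    EmuObject* obj = reinterpret_cast<EmuObject*>(sender.userData);
    return obj != nullptr ? obj->onCommand() : "unbound";
}
```

The point of the single slot is that the handler never needs to know which kind of control sent the command.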
I may go ahead and try to fix a problem with the button label font before moving on from BasicButton. I'd originally thought it might be a good place to get window resizing modes working correctly, but this example just uses FOLLOW_LEFT and FOLLOW_TOP, which is regular win32 behavior. I'll probably look for another example that demonstrates resizing modes better.
I'm working on implementing BeOS window resizing, and just realized that I'm going to have to subclass every control that emulates a BControl. I'm trying to decide whether to use the regular BView wndproc for that, or to use a separate one -- maybe even a separate one for each subclassed control type, as we did with ClarisWorks. It shouldn't be too difficult to go either way at this point, so I think I'll try first to avoid too much separation. It made for kind of a mess with ClarisWorks.
Window resizing modes (B_FOLLOW_BOTTOM, B_FOLLOW_RIGHT, B_FOLLOW_ALL, etc.) are now working correctly, so next I'll move on to implementing other control types. I wound up having to separate BControlWndProc from BViewWndProc for subclassing purposes, but I don't think I'll have to have separate wndprocs for each control type. We'll know for sure within the next day or two.
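The arithmetic behind follow-style resizing is simple enough to sketch. This is a simplified portable model, not the BeOS encoding: the kFollow constants are hypothetical bit flags (the real B_FOLLOW_* values are defined differently), and the centering modes are left out.

```cpp
#include <cassert>

// Hypothetical flags for illustration; not the actual BeOS constants.
enum FollowFlags : unsigned {
    kFollowLeft   = 1u,
    kFollowRight  = 2u,
    kFollowTop    = 4u,
    kFollowBottom = 8u,
    kFollowAll    = 15u
};

struct Rect { int left; int top; int right; int bottom; };

// Given a child's frame in parent coordinates and the change in the parent's
// width and height, compute the child's new frame.
//  - following only left/top: the child stays put (plain win32 behavior)
//  - following only right/bottom: the child moves with that edge
//  - following both sides on an axis: the child stretches along that axis
inline Rect resizeChild(Rect child, unsigned flags, int dw, int dh) {
    const bool followLeft   = (flags & kFollowLeft) != 0;
    const bool followRight  = (flags & kFollowRight) != 0;
    const bool followTop    = (flags & kFollowTop) != 0;
    const bool followBottom = (flags & kFollowBottom) != 0;

    if (followRight) {
        child.right += dw;
        if (!followLeft) child.left += dw;
    }
    if (followBottom) {
        child.bottom += dh;
        if (!followTop) child.top += dh;
    }
    return child;
}
```

In the emulation this calculation would run for each child when the parent handles a size change, presumably from the subclassed control's wndproc.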
I can't seem to find a BeOS sample app containing a real dialog, so I'm working now on a larger sample app that combines MessageWorld with BasicButton, and adds some or all of the other basic control types. While putting it together I noticed a little bug in MessageWorld -- the BView hwnd isn't handling WM_SETCURSOR messages properly. When the cursor changes into the resize arrows as it passes over the window edge, it doesn't change back to the regular system cursor when it's over the interior of the window. That shouldn't be too hard to fix -- just wanted to write it down so as not to forget it.
Today I took the first steps toward combining the MessageWorld and BasicButton sample apps into a new ControlWorld applet. I got a version of MessageWorld with a button in it building and running under BeOS, and discovered that my implementations of BView::AddChild and BView::AttachedToWindow are incomplete -- they don't yet properly support nested BViews. I'm fixing that now -- the first version of the combined applet, with the nested BView (i.e., BButton that's a child of another BView) will probably be working by mid-day tomorrow.
I've finished making BView::AttachedToWindow recurse properly so that child views know when they're attached, and in the process resolved some Z-ordering problems between parent and child BViews -- children are now on top of parents, at least by default. So this little ControlWorld applet now appears to behave just like it does under BeOS. Under the hood, however, I have a feeling it's still not right, because the BButton's MessageReceived member function isn't getting called on button presses. Instead, the button-click message is being dispatched to the owning window's MessageReceived function. I need to add user-visible behavior to each control's BeOS message handling code so I can tell for sure I've gotten it right without having to spend time in the BeOS debugger.
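A minimal sketch of the recursive attach, assuming a view tree of non-owning pointers. View and Window here are hypothetical stand-ins for the emulated BView and BWindow, and the member names are illustrative rather than the real BeOS signatures.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Window;

struct View {
    std::string name;
    Window* owner = nullptr;
    std::vector<View*> children;  // non-owning; the caller keeps views alive
    int attachedCalls = 0;

    explicit View(std::string n) : name(std::move(n)) {}

    // Sets the owner on this view and recurses so that every descendant
    // learns it has been attached too.
    void attachedToWindow(Window* w) {
        owner = w;
        ++attachedCalls;
        for (View* child : children) child->attachedToWindow(w);
    }

    // If this view is already attached, the new child (and its subtree)
    // must be attached immediately rather than waiting for a later pass.
    void addChild(View* child) {
        children.push_back(child);
        if (owner != nullptr) child->attachedToWindow(owner);
    }
};

struct Window {
    std::vector<View*> topViews;
    void addChild(View* v) {
        topViews.push_back(v);
        v->attachedToWindow(this);
    }
};
```

Both orders work in this model: building a subtree first and attaching it in one go, or adding a child to a view that is already in a window.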
I'm trying now to get the app's DialogNewDoc.cpp to compile so that, as I work on getting the rest of the basic BeOS control types to work, I'll be using real-world functioning code. It's a bit harder than I expected. When I first saw how tangled up this module was with the rest of the app, I thought I might as well go ahead and try to get the whole app to compile. After about an hour of heading down that path I concluded that it would be at least a week of work. It looks like the first step would be to modify the makefile to support win32 builds -- an MSVC automake project would be very tough to create and to maintain in parallel with makefiles for other platforms.
So I came back to just DialogNewDoc.cpp, made a little automake project just for it, and started adding BeOS headers to BeEmulation as required. My best guess is that I had to add between 50 and 100 before finally getting a full preprocessor pass. Some of them look like headers that probably aren't supposed to be included -- I may have header-search-path problems that are causing the wrong versions of certain same-named headers to be included, which would trigger a whole cascade of wrongly-included headers. And then there is a worrisome problem with MSVC compiler error C2733: "Second C linkage of overloaded function 'x' not allowed." It may require #ifdef'ing the BeOS version of string.h, at least -- I'll have to determine how widespread it is.
It looks like there are five Be headers that conflict with MS headers: headers\be\interface\Control.h, headers\be\media\Buffer.h, headers\be\support\Errors.h, headers\be\support\List.h, and headers\be\support\String.h. Normally I would deal with this by making a rule that within the app the include-file search path must always go first to Be headers and then to MS headers. That may not work here, though, because Be's String.h #includes the c-runtime string.h (differing only in case). Under Windows we need to be using the MS version of the c-runtime string.h. I'm going to experiment with fixing this by trying to add a path to the #include directive.
It looks like I'm going to have to add quite a few new rules for configuring the MSVC environment to build Gobe Productive -- enough to compound the project-definition maintenance problem and make it essential to use makefiles for production builds. I'm getting very close to being able to build the app's DialogNewDoc.cpp (after going down quite a few blind alleys), but at minimum it's going to require adding all the Be header folders to the #include-file search path in exactly the right order.
Here's something else. We need to add both __INTEL__ and DEBUG to the list of predefined preprocessor symbols. (Microsoft normally defines only _DEBUG for debug builds.)
So I've finally gotten all of the app's headers to compile under win32. Or at least I've gotten all the ones #included by DialogNewDoc.cpp to compile, and given the amount of header tangle in this codebase, I'd guess that it's a fair percentage of the total. I've still got 48 link errors to resolve before I'll be able really to start making this new-doc dialog work, but they don't look too hard: just a bunch of BeOS things that I haven't yet gotten around to implementing, plus all the TDialog and TString member functions, and maybe a few others. While it's still reasonably fresh in my memory, I need to write down what I've done to make all this work, and a little about why I've done it.
First, I'm using the /FI (Force Include) compiler command-line switch to effectively insert StdAfx.h and BeEmulation.h at the top of every .cpp file in the app. "StdAfx.h" is an odd name, I know, but it's Microsoft's standard name for a header which encapsulates the subset of the win32 API that you plan to use -- every Windows programmer in the world will know exactly what this file is for. I've extended its normal usage a little to include conflict resolution; everything we do to resolve symbol name conflicts (both preprocessor and non-preprocessor) between win32 and BeOS headers is in StdAfx.h.
BeEmulation.h defines all parts of the interface to BeEmulation.dll that aren't defined by the regular BeOS headers. It's not much at this point: just some debug stuff and a couple of accessor functions for passing information into the DLL.
Having to make the whole win32 API visible to every .cpp in the app is something I'd wanted to avoid, but if we don't do that, it won't be practical to #ifdef the BeOS headers to add win32 data, and that would make this emulation a lot more difficult to write and maintain. I did try a couple of days ago to back out some of the #ifdef'ing I've already added to the BeOS headers so as to avoid this, but quickly saw that it would be much more expensive than simply using macros in StdAfx.h to resolve all the win32/BeOS symbol name conflicts.
Second, I'm relying on an ordering of the #include-file search path that lets me resolve file-name conflicts with a simple set of rules. We always search in this order: (1) the BeEmulation folder; (2) the app folder(s); (3) the BeOS header folder(s); and (4) the win32 header folder. So, when we have to modify a BeOS header, we just put the modified copy in the BeEmulation folder so that it overrides the unmodified copy. When both win32 and BeOS use the same name (or names differing only in case) for different headers, we go ahead and let both win32 and BeOS pick up the BeOS version (because of the search order), but it all works because before #including the win32 headers we forcibly include the win32 versions of these same-named headers in StdAfx.h with a statement like this:
#include "../include/string.h"
... which guarantees that the win32 version (which always lives in a folder named "include") is read instead of the BeOS version (which never lives in a folder named "include").
I've also had to fix a bunch of problems which triggered cascading inclusions of wrong header versions by putting #includes of same-named but non-standard headers in #ifndef _WINDOWS blocks. All of these were in the app itself, mostly in Headers.h. I had to split the debug macros out of the app's Headers.h into a new header named DebugMacros.h (which Headers.h now simply #includes) so that it could be included by the version of Table.h in BeEmulation -- I couldn't find any other way to fix all the redefinition errors you get if you try to make Table.h compile both inside and outside the app.
There is a fairly good MS Knowledge Base article on "Unicode Support in Windows 95 and Windows 98" at http://support.microsoft.com/support/kb/articles/Q210/3/41.ASP. Support for UTF8 under NT4 (and presumably Windows 2000) is described at http://support.microsoft.com/support/kb/articles/q175/3/92.asp. From what I can tell, there just is no support for UTF8 under Win95/Win98.
It seems like I'm spending an awful lot of time just dummying up UNIMPLEMENTED() member functions for BeOS classes so that I can get things to link and run in the debugger. I think now that if I had spent a few hours writing a little tool to do this early on, it would have saved days of work. I might try to whip up something in awk tonight; it might be doable in just a couple of hours.
Awk is such a handy little tool. It took me about an hour to write a little awk script that does 95% of the work required in generating "UNIMPLEMENTED()" stub functions -- it's already saved several hours in whittling down the unresolved external list for the tiny little part of the app I'm trying to get running (the Document Size dialog). Here is the full source for the script:
{
    # Delete the words "static" and "virtual" anywhere they appear on the line.
    sub(/static/, "");
    sub(/virtual/, "");

    # Delete leading and trailing whitespace.
    sub(/^[ \t]+/, "");
    sub(/[ \t]+$/, "");

    # Delete trailing semicolon.
    sub(/;$/, "");

    # Compress every run of tabs/spaces into a single space.
    gsub(/[\t ]+/, " ");

    # If we still have something on this line ...
    if (length($0) > 0) {
        # Parse out the member name and the return type.
        memberName = substr($0, match($0, /([A-Za-z0-9_\~]+)(\()/), RLENGTH);
        returnType = substr($0, match($0, /([A-Za-z0-9_\~]+)([ \t])([\*\&]*)/), RLENGTH);

        # Prefix member name with CLASSNAME:: -- you can replace it with the right class name in the editor.
        sub(/([A-Za-z0-9_\~]+)(\()/, "CLASSNAME::&");

        print $0;
        print "{"

        # If the returnType is not "void" ...
        if (match(returnType, /void/) == 0)
            print "\t" returnType " result = 0;";

        print "\tUNIMPLEMENTED(\"CLASSNAME::" memberName ")\");";

        # If the returnType is not "void" ...
        if (match(returnType, /void/) == 0)
            print "\treturn result;"

        print "}\n\n";
    }
}
This little awk script has been a huge help; it lets me do in five minutes what used to take thirty or forty.
I've been struggling today with an irritating compile error: Microsoft's "error C2666: '[]' : x overloads have similar conversions." Microsoft has a poorly written and unusually defensive explanation of it at http://support.microsoft.com/support/kb/articles/Q106/3/92.asp. It happens when an operator is overloaded according to whether it is const or non-const, and the compiler has to do its own conversion of an index argument. This triggers what seems to me a bug in its determination of which version of the operator to use.
Anyway, here is the scheme I'm using to deal with problems like this. First, make sure that, in both versions of the index operator, its argument type is int-length so you won't have to add type casting in all the places where the code is just using a constant with the index operator to get a const result. In places where the code needs a non-const result, you have to add type casts so as to defeat the implicit type conversion which seems to trigger the bug. This gets you past the errors with the fewest possible code changes.
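The shape of that workaround, as a sketch only: Buffer, readFirst and writeAt are hypothetical names, and this portable example does not reproduce the MSVC error itself. It just shows both index operators taking int, a literal index needing no cast, and an explicit cast at a call site whose index is not already an int.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(n, 0) {}

    // Both overloads take the same int-sized index type.
    int& operator[](int i) { return data_[static_cast<std::size_t>(i)]; }
    const int& operator[](int i) const { return data_[static_cast<std::size_t>(i)]; }

private:
    std::vector<int> data_;
};

// Const access with a literal: the literal is already an int, so no cast.
inline int readFirst(const Buffer& b) { return b[0]; }

// Non-const access with a non-int index: cast explicitly so the compiler
// has no implicit index conversion left to perform.
inline void writeAt(Buffer& b, unsigned long index, int value) {
    b[static_cast<int>(index)] = value;
}
```

The casts are confined to the call sites that need a non-const result, which keeps the number of touched lines small.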
I've checked all the BeEmulation sources into CVS under Phoenix/BeEmulation (note change of location -- it's now under the app) and am returning to the problem of getting dialog code in the app to build and work at a minimal level.
I've gotten everything checked into CVS now, and have verified several times that I can delete everything locally and then get back immediately into a fully productive state just by doing a full CVS checkout. I found it a little difficult at first to keep track of pending changes because CVS makes local sources writable by default -- I normally determine what I may have changed locally just by listing local sources that are writable. I tried changing the default by setting cvs_options=-r in an environment variable, but it doesn't seem to work. This, however, does work:
CVSREAD=yes
This makes all local sources initially read-only, which is a more familiar working model for me.
The whole app compiles and links under Windows now. Next task: getting it to build under Scott Lindsey's makefile system so that we can create a win32 build monitor.
I read through the GNU Make docs today. It looks like it will be good enough to support a reasonable win32 build system, and will allow large parts of our makefiles to be cross-platform. It's excruciatingly slow, however, so we may need also to maintain an automake version of the system just for efficiency.
GNU Make is starting to look extremely flaky. I spent at least an hour yesterday trying to get the string substitution functions to build the include-paths list for CFLAGS. Out of any five plausible ways of using any one of these things, it appears to me that only one is likely to work. Brings back memories of the endless sea of PolyMake bugs -- not many crashers, but thousands of little things that just weren't very robust during its early years.
I'm trying now to get a very simple but very long dependency line to work. It's the list of all the .obj files that get linked into the executable, and GNU Make is being very obtuse about it. This needs to get better.
Okay. The makefile is working reasonably well now. The link step works okay, the browser database is getting updated, dependencies are automatically updated, useful info is left behind when the build fails but is all cleaned up when it succeeds, etc. And I've got a simple build monitor written, but not yet tested. This build monitor is a little smarter than Apple's in that it only sends mail to people who have touched files implicated in build errors.
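The recipient-selection rule can be sketched as a small set computation. This is a portable illustration with hypothetical names: mailRecipients is not the actual monitor code, and it assumes the monitor already has a table of who recently touched each file (for instance, gathered from the version-control log).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// touchedBy: file name -> people who recently changed it (assumed input).
// filesWithErrors: files named in the failed build's error output.
// Returns only the people connected to a file that actually failed.
inline std::set<std::string> mailRecipients(
        const std::map<std::string, std::vector<std::string>>& touchedBy,
        const std::vector<std::string>& filesWithErrors) {
    std::set<std::string> recipients;
    for (const std::string& file : filesWithErrors) {
        auto it = touchedBy.find(file);
        if (it == touchedBy.end()) continue;  // nobody on record for this file
        recipients.insert(it->second.begin(), it->second.end());
    }
    return recipients;
}
```

Using a set means someone who touched several failing files still gets a single message.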
I thought I'd make a quick note of this while I'm thinking about it. Pkware
is using an interesting banner-ad system described here:
http://www.pkware.com/shareware/sponsors.html
As much as I prefer makefiles over IDE-style project definitions, I have to admit that they are sometimes hard to deal with. I've just finished wasting at least two hours, probably more, trying to figure out why the Phoenix Makefile no longer works under Windows after Scott Lindsey's changes yesterday. The reason, it turns out, is that a variable named MACHINE, which we set at the top of the makefile, was getting silently reset to an unexpected value inside makefile-engine, the include for which is no longer ifdef'd out for Windows. I've changed all the win32 ifdef'ing over to use the OS variable, which should be more reliable, so things are working again, but it illustrates one of the downsides of using Makefiles.
Forgot about having to read resources that have the resource name tacked onto the end. Got that working today, and implemented LoadResource (which keeps resources in memory until the resource file is closed). Looks like next I have to get a minimal BPath implementation working. That's close; it shouldn't take more than another hour or so.
Did a very minimal BPath implementation this morning, and am now finally back into fleshing out the parts of the emulation that are most used in the app's dialog code. It looks like my next task will be to add support for finding views by name to my BView emulation.
I'm now into implementing BListView, and with Scott Lindsey's help just discovered an oversight in the code that maintains BWindow/BView hierarchies -- when one BView was attached to another BView that was itself already attached, the child BView's owner was not being set, although its own child BViews' owners were being set correctly.
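A stripped-down sketch of the corrected behavior, using stand-in `View` and `Window` types rather than the real emulation classes (the member names here are hypothetical): attaching a child to an already-attached view has to set the owner on the child itself as well as on everything below it.

```cpp
#include <vector>

struct Window;  // stand-in for BWindow

// Hypothetical, minimal stand-in for the emulation's BView.
struct View {
    Window* owner = nullptr;
    View* parent = nullptr;
    std::vector<View*> children;

    // Set the owner on this view and on its whole subtree.
    void set_owner(Window* w) {
        owner = w;
        for (View* child : children) child->set_owner(w);
    }

    void AddChild(View* child) {
        child->parent = this;
        children.push_back(child);
        // If this view is already attached, the new child must inherit
        // the owner immediately, and so must the child's own children.
        if (owner != nullptr) child->set_owner(owner);
    }
};

struct Window {
    View root;
    Window() { root.owner = this; }
};
```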
Still trying to get this New Doc Dialog working. I was amazed at how easy it was, actually, to get past all the resource and window/view hierarchy problems. That seems all to be working fine now (well, actually, I need to remember to get the new splash screen bitmap -- BMPm.101 -- from Scott). But the controls aren't visible in the dialog window, and I just figured out that it's because they're being created outside the window -- too high and too far to the left. I'm not yet sure why.
I have buttons showing up in the app's New Document dialog now. The main
problem turned out to be that I had not added support for B_FOLLOW_-type
tracking on resizes of BViews -- I only had it working for BWindows and
their children, so it worked one level down the hierarchy, but no further.
The buttons are blank, however, and are not correctly located within the
dialog window. Figuring out why we have those two problems is the next task on
my list.
Just finished writing a little launcher applet for the BeOS 5 install CD. It has a couple of logo bitmaps and configurable button text and button actions. It's called "Launcher." I should probably check its source into CVS, but I'm not sure where.
I fixed a big mess in the way that HWNDs corresponding to nested BViews and controls were being maintained, and it turned out to correct all the control positioning problems I've been chasing for the last few days. Next I'll be focusing on control labels -- the buttons in the New Document dialog are still blank; seems like it shouldn't be too hard to figure out why. It may also be related to why the listbox items are blank.
I figured out why the New Document dialog's buttons were coming up blank -- I was setting the button's window text from the view name instead of from BControl::Label(). In the sample applets that I'd gotten to work earlier, the same string was passed into both, but Phoenix seems to economize its use of view names. My guess is that essentially the same problem is causing the list box items to be blank. I'll chase that next.
I'm trying to get message dispatching out of the wndprocs so that BMessages are enqueued and ultimately dispatched in the same order that they would be under BeOS, and I'm having trouble now maintaining BLooper/BHandler associations. I've got BHandlers floating around that have NIL fLooper members. I've corrected two or three problems in the last couple of hours that I thought were probably causing it, but it's still happening. Hmmm.
Okay, it looks like BHandlers are now reliably getting associated with BLoopers at the right times, but I still have a problem with BViews that don't get their NextHandlers set, for some reason. It's probably not directly related to the immediate task, which is to get BMessage dispatching promoted up into the message loop so that it all happens in just one place and in the same way that it would under BeOS, but I'm still worried that there is something fundamental missing in my BView creation code.
I figured out why I was seeing NIL NextHandlers for certain BViews. It turned out to be the first view at any level in the hierarchy which had this problem -- I'd just overlooked an edge case in AddChild. Now that this is fixed, buttons in the app's New Document dialog are working, because the BMessage is now getting passed up the chain of BHandlers until it gets to the right one. The BMessage is still getting dispatched from the wndproc instead of from the top level win32 message pump, however. I plan to fix that today.
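For reference, the handler-chain behavior the buttons depend on looks roughly like this. The `Handler`, `View`, and `Dispatch` names are stand-ins for illustration, not the emulation's real classes; the point is only that every child, including the first one added at each level, gets its parent as next handler.

```cpp
#include <vector>

// Hypothetical stand-in for BHandler: a handler either accepts a message
// code or defers to the next handler in the chain.
struct Handler {
    std::vector<int> accepted;  // message codes this handler consumes
    Handler* next = nullptr;

    bool Accepts(int what) const {
        for (int code : accepted)
            if (code == what) return true;
        return false;
    }
};

// Stand-in for BView. AddChild links every child to its parent,
// with no special case for the first child.
struct View : Handler {
    std::vector<View*> children;

    void AddChild(View* child) {
        child->next = this;
        children.push_back(child);
    }
};

// Walk up the chain until some handler accepts the message.
// Returns nullptr when nobody does.
Handler* Dispatch(Handler* first, int what) {
    for (Handler* h = first; h != nullptr; h = h->next)
        if (h->Accepts(what)) return h;
    return nullptr;
}
```

With a button nested two views deep, a message the button doesn't accept climbs through both views and lands on whichever handler at the top claims it.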
I'm having trouble with BListView, which probably shouldn't be surprising, but it's tantalizingly close. There are several problems now. One is that the WM_PAINT handler and the WM_DRAWITEM handler are getting different DCs, with different fonts selected into them, so listbox text changes its appearance when items are selected. Another is that listbox text is being drawn too far down, by about the height of a title bar; of course that suggests a window/client-area confusion, but I can't find it. I'm also not getting proper adornments around the listbox -- that's probably just some mistake in the window styles I'm passing into CreateWindow. And of course the listbox isn't actually working yet because I haven't implemented SetSelectionMessage, but I don't think that's going to be very hard. The wrong-DC problem seems like the most worrisome one here, so that's what I'm working on now.
Well here's a win32 gotcha that really sucks. As near as I can tell, it's
something inside the win32 dialog manager, not in the LISTBOX control, that
causes an owner-draw listbox to get WM_DRAWITEM messages. So, if you're trying
to use an owner-drawn listbox outside of a regular dialog, you only get
WM_DRAWITEM messages (which seem to have the right DC for the item) when the
selection changes. The consequence is that you wind up drawing the item one way
when the listbox first appears, and another way when the user clicks on an item
inside it.
This may force me to go back and create real win32 dialogs for BeOS windows
that appear from the look-and-feel flags to be dialogs. That might be better
long term -- I used to wish that we'd taken the time with ClarisWorks to use
dialog windows that win32 recognized as such -- but I'm going to experiment
with one more thing before biting that bullet. Maybe I can just forget about
handling WM_DRAWITEMs and do all listbox drawing in the WM_PAINT handler.
How did we deal with this in ClarisWorks?
That approach turned out to work reasonably well, at least for drawing purposes, although the absence of a real win32 dialog may still have something to do with the remaining problems I'm having with listboxes. They do sort of work now, however. They draw okay, and you can click on a listbox item and it will ultimately get you to the right place in the code.
I just solved the problem I was having yesterday with win32 listboxes not getting LBN_DBLCLK messages -- I just needed to add an LBS_NOTIFY to the window styles passed into the win32 CreateWindow. So we've now got full functionality in the dialog's controls, although there are still some other problems with the dialog's general functionality. The only parts in the list, for example, are WP and GR. And we've got a bunch of aesthetic problems. We've got an assortment of off-by-one problems with the edging, for example, and even if they're fixed, the control will still look ugly and flat; it needs 3D edging. In fact, the whole dialog needs all the aesthetic stuff that we'd get for free using a real win32 dialog, so maybe I do need to investigate making that change.
I wrote a little code yesterday that traverses the BView hierarchy on
FrameResizes looking for clipped children. It exposed part of the window-sizing
problem I've been chasing for the last day or so. I had the code that adjusts
BWindow bounding rectangles in its FrameResized function: a virtual. The app
overrides it, so win32 and BeEmulation were getting out of synch in their
understanding of BWindow sizes and positions. It didn't clear up the problem
with the New Doc dialog getting sized 20 or 30 pixels too short, however. I'm
still looking for that. I'm also seeing a clear failure of the layout code to
do enough horizontal resizing of a BListView parent -- maybe the scrollview. I
just realized that this may be because I'm suppressing too much BScrollView
behavior; I probably at least need to let it get FrameMoved and FrameResized
messages.
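One way to keep bookkeeping out of an overridable hook is to do it in a non-virtual entry point that then calls the hook. The sketch below shows that pattern with stand-in types; `em_frame_resized` is a hypothetical name, and this is not necessarily how the emulation was actually restructured.

```cpp
// Hypothetical stand-in for the emulation's BWindow.
struct Window {
    float width = 0;
    float height = 0;
    virtual ~Window() = default;

    // Called by the emulation when the native window changes size.
    // Non-virtual, so an app override cannot skip the bookkeeping.
    void em_frame_resized(float w, float h) {
        width = w;           // emulation bookkeeping happens first
        height = h;
        FrameResized(w, h);  // then the app-visible hook runs
    }

    // BeOS-style hook. Apps are free to override it without calling
    // the base version.
    virtual void FrameResized(float, float) {}
};

// An app window that overrides the hook, as the real app does.
struct AppWindow : Window {
    int relayouts = 0;
    void FrameResized(float, float) override { ++relayouts; }
};
```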
BBitmap is now implemented well enough that we can accurately draw all the bitmaps in the New Document dialog. Because the app relies on being able to write directly into the BBitmap.Bits() buffer, however, we're presently maintaining a copy of each bitmap's pixel data and updating win32's copy from it just in case the BeOS copy has been modified. This may cause performance/memory-consumption problems. Looks like the solution may be to use a DIBSection so we can draw directly into the DIB, but even a DIBSection doesn't let us treat the bitmap data just as an array of bytes. Need to do some more looking.
I'm now trying to get document windows to create themselves, and I'm alternating between crashing in menu code and crashing in a lot of initialization code that only started executing yesterday. I'd been skipping it until yesterday because it is meant to run in a separate thread, and something relatively uninteresting down inside it has been crashing for the last month or so. The document window menu code depends on having all that initialization done, however, so I'm getting it to work now. It's gotten all the way through several times, but I've discovered in the last hour that the reason is partly because I've got an emulation error related to ownership of dispatched/invoked BMessages. I just fixed the most obvious parts of that problem, so now I'm back to crashing in initialization code.
The app's debug dialog is now working, although it still has a bunch of graphic defects. I'm working now on the Document Settings dialog because it contains an edit field, which ought to be fairly easy to get working. I've now gotten it to create the win32 EDIT control and to draw more or less correctly. Should be able to wrap up support for communication with the control sometime tomorrow.
It turned out that, to do a reasonable job of implementing BTextControl, I needed to follow the BeOS model more closely than I'd expected to. I had to implement BTextView corresponding to the EDIT control, and another hwnd for the BTextControl which, at the moment, is a STATIC control with an EDIT control as a child -- unconventional, but it seems to work. I've still got a positioning problem with the EDIT control, and I don't have any communication yet between any of these new entities, but I can now see pretty clearly how the communication will need to work. I expect to have it all done sometime tomorrow.
I now have the BeOS doodle applet up and running under win32, with no changes to any of the doodle code proper. Had to fix a bunch of little emulation errors to do it, and had to turn on thread-spawning. It has a relatively complex menu and event-handling structure, and that's all working, but it's not doing much drawing right now. Before I try to figure out why, I'm going to fix support for creation of invisible windows to see if it improves the appearance of the app.
This doodle applet is turning out to contain lots of little challenges, but all useful, so far as I can tell. It exposed an error this morning in my implementation of BView::ResizeTo, and an error yesterday in my implementation of SetTargetForMenuItems (or whatever that function is called). It has a floater, so I had to get that kind of window creation working right, and that has improved the app's functionality. I'm only just now getting into its drawing functionality, however. I'm working now on getting the bitmaps in its floater to draw correctly. They now draw just as thin little black rectangles -- wrong colors, and either the wrong height or the wrong origin in the floater, but correct horizontal placement and width. My best guess is that the height problem is a client vs window coordinate confusion somewhere, since the bitmap rects are too short by about the height of the title bar. That's what I'm chasing now.
Here's where I am now on support for drawing modified bitmaps. It appears
that, when you create a 256-color DIB with a specified color table, Windows
reorganizes the color table for some reason -- probably to get system colors
into the first 16 (or is it 20?) slots. So, when you call GetDIBits
later on, the pixel (i.e., color-table index) values are different, and it gives
you the reordered color table.
So for the time being, at least, I'm just
maintaining a copy of the original bit values. If this copy is available (see below
for the one case where it won't be), we always return it from BBitmap::Bits().
We synch up the win32 bitmap with this copy in two places: in SetBits,
and (if we detect that the BBitmap's bit values may have been modified via a
call to the BBitmap's Bits() function) in DrawBitmap.
This scheme has bitmaps in the BeOS
doodle applet drawing correctly now.
I'm planning to throw away our copy
anytime a BView is made a child of a BBitmap for purposes of drawing into the
BBitmap, but that code isn't written yet. We may be able to get away with this
long term, since there may never be cases where we need to directly modify a
BBitmap via its Bits() buffer after drawing into it via a child BView.
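In outline, the shadow-copy scheme looks something like the sketch below. The class name, the `Draw` stand-in for DrawBitmap, and the use of a plain byte vector in place of the win32 bitmap are all assumptions made so the example is self-contained.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: `shadow_` holds the original bit values that the app
// sees; `native_` stands in for the win32-side bitmap.
class ShadowedBitmap {
public:
    explicit ShadowedBitmap(std::size_t size) : shadow_(size), native_(size) {}

    // Copy caller data into the shadow buffer and push it to the native side.
    void SetBits(const std::vector<std::uint8_t>& data) {
        std::copy_n(data.begin(), std::min(data.size(), shadow_.size()),
                    shadow_.begin());
        Sync();
    }

    // Handing out a writable pointer means the caller may change the bits,
    // so treat the native copy as possibly stale from here on.
    std::uint8_t* Bits() {
        maybe_modified_ = true;
        return shadow_.data();
    }

    // Stand-in for DrawBitmap: resync first if needed, then "draw" by
    // returning what the native side currently holds.
    const std::vector<std::uint8_t>& Draw() {
        if (maybe_modified_) Sync();
        return native_;
    }

    int sync_count() const { return sync_count_; }

private:
    void Sync() {
        native_ = shadow_;
        maybe_modified_ = false;
        ++sync_count_;
    }

    std::vector<std::uint8_t> shadow_;
    std::vector<std::uint8_t> native_;
    bool maybe_modified_ = false;
    int sync_count_ = 0;
};
```

The cost is the one noted earlier: every bitmap is held twice, and any call to Bits() forces a full copy at the next draw whether or not anything changed.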
I'm moving on now to figure out why I'm getting strange LineTo failures in this
doodle applet, and I'll probably implement support for clipping regions after
that.
The strange LineTo failure turned out to be an easy one -- it was just an uninitialized PenSize() value. I'm now defaulting PenSize() to 1.0, as the Be Book says we should. I also found and fixed one of the things that has been causing some of the window sizing problems in the app. It looks, however, like doodle isn't going to let me do anything meaningful with the clip region until I get some click handling in place, so I may need to go ahead and do at least some of that now. Could also implement BRegion and get it unit tested, however.
There is an article here that contains code for converting DDBs to DIBs -- looks like it might be useful at some point, but not immediately, so I'll just note it for later reference.
One of the mysteries I'm dealing with here is that, when I was doing the initial round of this bitmap work a couple of weeks ago, I couldn't make GetDIBits work with an HBITMAP created by CreateDIBitmap, but it ultimately did start working with an HBITMAP created by CreateCompatibleBitmap. That's exactly the opposite of what you would expect and what the SDK docs seem to imply about GetDIBits. So now I have a DDB that I should be able to draw into, and in fact can draw into if you judge only by the return codes of win32 drawing operations. But the bit buffer returned by GetDIBits isn't changed by any of these drawing operations. I've been tempted to experiment with a DIBSection, since that is what you're supposed to use for drawing into DIBs, but as near as I can tell (despite the odd success of GetDIBits), I'm not using a DIB.
I discovered yesterday afternoon that part of the problem with my
draw-into-bitmap code is that you can't get anything out of the bitmap while
it's still selected into a DC. Fixing that made it possible at least to get bits
out of the bitmap that I'd forced into it via SetDIBits.
Now I have a theory about why I'm not able to draw into the bitmap with
operations like LineTo, FillRect, etc. The more I read and pore over
bitmap-related sample code, the more it appears that only two operations are
actually supported on the dc into which the bitmap is selected: BitBlt and
StretchBlt. My guess is that I have to create two memory DCs: one into which
the bitmap will be selected, and another into which I'll draw. After the
drawing is done, I'll try BitBlt'ing from the second into the first. If that
doesn't work, I'm putting this aside for a while and moving on to BRegion.
Fixed a heap corruption today, as well as an error in maintenance of the BView/BWindow hierarchy that was happening when an attached BView was explicitly removed. I'm now trying to figure out why we always seem to create WP docs coming out of the New Document dialog -- I'd like to see whether a GR doc is drawing any better.
I finally got Graphics documents to create this afternoon. The problem(s) turned out to be mostly in BListView and BScrollView, and were primarily errors in BMessage creation and posting. I'm returning now to drawing problems.
Still can't figure out where all the bad area-fills are coming from in
document windows. I've gotten the client area of doodle to draw pretty well now
by adding code to BScrollView::AttachedToWindow
that undoes adjustments that client code has made for menubar and scrollbar
positioning -- those are outside the win32 client area, but inside the BeOS
client area. I had hoped that this work would at least fix the errors around the
edges of GP document windows, but no such luck.
I have some suspicions
that part of the problem has to do with bad clipping regions. I'm seeing an
awful lot of filling and drawing operations that aren't having any visible
effect on the screen. Most of them seem to be in offscreen drawing that isn't
controlled by the debug dialog's "Offscreen drawing" checkbox -- there seem to
be lots of places in the app where we should be respecting that flag but aren't.
But I don't think that's the whole explanation.
I think I'm going to put
this investigation aside for a little while and work on a better fix for the
problems we are having around the edges of document windows. Getting rid of the
scrollbars that win32 is creating automatically should be a good first step.
Finally got the splash screen to draw correctly. The problem turned out to be that I was using BView::Bounds within code that was trying to complete the BView::AddChild process (in BView::set_owner(), to be specific). I didn't realize until I traced through it carefully that BView::Bounds wants to return win32 client coordinates, which it can't do in the normal way until the BView's win32 hwnd has been created.
I've fixed another half dozen or so places where BView drawing operations
weren't respecting BView::HighColor and BView::LowColor, and seemingly as a
result, the content areas of graphics document windows are now beginning to
draw. I say seemingly because I also removed a FillRect
in the default implementation of BView::Draw that I now believe is unnecessary
and wrong, but that may also have contributed to this improvement. I've also
added code to convert BeOS
patterns into win32 brushes for use in drawing operations, but it's not getting
called, apparently because we are always using solid patterns in these simple
drawing cases, so I don't know whether it is working correctly yet -- can't
think of an easy, reliable way to unit test it.
This is a note to myself
that I need to make sure the CHECK_WIN32_CALL macro is working correctly; there
is something suspicious about the number of win32 api failures that have gone
away since I started using that macro.
Finally got content areas of WP documents to draw. The problem turned out to be entirely clipping-related, and boiled down to two errors. One was just a simple mistake in BRegion::Set that was forcing a lot of regions to have zero width. The other was a difference between the win32 and BeOS models for what the clipping region looks like when there is no clipping region: the BeOS GetClippingRegion returns a valid region that describes the whole window, minus a bunch of stuff, whereas win32's GetClipRgn returns 0, meaning that there just is no clip, even though drawing is still clipped at the edge of the window, at least.
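The second mismatch can be bridged with a small adapter along these lines. This is a simplified sketch: a single `Rect` stands in for a full region, a null pointer models the "no clip region" answer from the native side, and `ClippingRegion` is a hypothetical name.

```cpp
struct Rect {
    int left, top, right, bottom;
};

// Hypothetical adapter. `nativeClip` is null when the native side reports
// that no clip region is set; BeOS-style callers still expect a valid
// region, so fall back to the view's bounds in that case.
Rect ClippingRegion(const Rect* nativeClip, const Rect& viewBounds) {
    if (nativeClip == nullptr) return viewBounds;
    return *nativeClip;
}
```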
Jeez Louise. Build problems up the wazoo today. This last one was
particularly frustrating. It turns out that parsedate.h, which I had to copy
over into BeEmulation/win32
from its original (and probably bogus) location in a BeOS
posix folder, had its #include
I've fixed the problem with document window elements (e.g., ribbons) being drawn too far down, by the height of the menu bar. I haven't yet checked it in, however, because I changed my mind about how to do it after talking to Holdaway. He had suggested adding an OSUtils.cpp to the app, containing a function named something like ClientAreaSpaceOccupiedByMenuBar(), which would return zero under Windows, and a standard menu bar height under BeOS. After looking at it, though, it seemed a lot cleaner to add a TMenuBar class to the app, paralleling the TMenu and TMenuItem that were already there. TMenuBar is derived from BMenuBar in a simple way, but adds one new method -- InContentArea -- that tells you whether the menu bar is part of the client area or not. This lets the code compute the menubar's impact on the content area's size in the same way it does now, with the one addition of asking this method whether the menubar is part of the content area at all. I think this will be better long term, because we may eventually want to implement detachable win32 menus (like MSVC has now) which may or may not occupy space in the client area, depending on whether they are attached.
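A rough sketch of the TMenuBar idea follows. The `BMenuBar` here is only a minimal stand-in so the example compiles on its own (the real class has a different interface), and the constructor flag and `ContentTop` helper are hypothetical.

```cpp
// Minimal stand-in for BMenuBar, not the real BeOS class.
class BMenuBar {
public:
    explicit BMenuBar(float height) : fHeight(height) {}
    virtual ~BMenuBar() = default;
    float Height() const { return fHeight; }

private:
    float fHeight;
};

class TMenuBar : public BMenuBar {
public:
    TMenuBar(float height, bool inContentArea)
        : BMenuBar(height), fInContentArea(inContentArea) {}

    // True when the menu bar occupies space inside the client area (the
    // BeOS case); false when the OS draws it outside (the win32 case).
    bool InContentArea() const { return fInContentArea; }

private:
    bool fInContentArea;
};

// Layout code computes the menu bar's impact as before, but first asks
// whether the bar takes up any content-area space at all.
float ContentTop(const TMenuBar& bar) {
    return bar.InContentArea() ? bar.Height() : 0.0f;
}
```

Because the answer comes from the menu bar object rather than from a platform check, a detachable menu could later change its answer at run time.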
After talking to Holdaway about it, I made a few changes in this new TMenuBar::InContentArea()
scheme last night, rebuilt the whole app, tested it, and then checked it all in,
thinking that I'd been very careful about it. This morning I found out that I
missed making the BMenuBar-to-TMenuBar
changes in Phoenix subfolders, and didn't see the problems because we're not yet
building external parts under Windows, so I broke the BeOS
build. Tom helped me get it straightened out.
So, before I plow ahead
with more work on win32 drawing, I think I need to get those parts building and
loading under Windows just so things like this don't happen again. I plan to
start just by getting the Clock part to build, since it's so simple. If that
turns out to be easy, I'll go ahead and do all the rest of the parts, get them
to load when the app launches, and maybe get them to run a little bit.
I got bitten by a ridiculous cluster of GNU Make bugs while trying to get the makefiles working again under Windows. The following six-line makefile fails because it doesn't know how to create api.dep from an api.c in the current directory:
%.dep : %.c
	@echo Making $@ from $< using a pattern rule.
	@touch $@

_targ: api.dep
	@echo Done.
Scott Lindsey verified that it works fine in GNU Make 3.77 under BeOS and Linux. I had been using the win32 version of GNU Make 3.75. So I went to http://www.sourceware.com/cygwin and downloaded 3.77, and was immediately bitten by a bug that makes me reluctant to continue using this thing: it has lost its ability to deal with CR/LF-delimited text files. See http://sourceware.cygnus.com/ml/cygwin/2000-06/msg00009.html.
So I'm back at work on the emulation, having given up temporarily on getting Parts to build. This morning's update of the BeEmulation sources put us in a state where the app couldn't get past the splash screen with multi-threading turned off, and we had a pretty consistent new crasher with it turned on, and the new crasher appeared to be related to some assertion failures that were being ignored. So I decided to go ahead and do something I've been meaning to do for several weeks: make the UNIMPLEMENTED macros use a simple OutputDebugString, and the ASSERT, ASSERTC, etc., macros use something which would optionally stop us in the debugger. Doing that forced me to clean up a half-dozen or so assertion failures, mostly resource related, that I've been ignoring for the last few months.
I got edit controls working again yesterday -- they had been completely broken for about a week -- and finally fixed the size/position problems we were having with them. There were actually three separate errors all compounding the problem with these things -- failure of the BControlWndProc to use Bruce's new BView::em_change_size, failure of the BTextControl constructor to set correct B_FOLLOW_... flags, and failure of BTextControl::FrameResized to respect the BTextControl::FrameDivider() value.
I just finished fixing a problem with drawing ribbon elements by making the app code involved respect the app's offscreen-drawing switch. I talked to Scott Lindsey about it first; he couldn't think of any reason why drawing of UI stuff shouldn't respect this switch. The only reason it doesn't now, he said, is that it was originally meant only to control content-area drawing.
I made some progress yesterday on the drawing-mode problem that has been causing many ribbon icons to be completely gray or black. I implemented a form of FillRect that tries to respect the B_OP_MIN and B_OP_MAX drawing modes. It doesn't work very well, though, so I think Holdaway is going to change how the app lightens and darkens icons so we can just avoid this problem altogether.
I've been looking for the last half hour at why the document-scale indicator in the lower left corner of a document window is always black. The reason, it turns out, is our old problem with offscreen drawing into 8-bit bitmaps. I know that at least some forms of that were working at one point, but maybe this is a form that never worked, or maybe something got broken. I'm going to chase it for a little while, and then experiment with changing the bitmap to a 24-bit bitmap in the app.
I think I may be on to something here in trying to figure out what is wrong
with our offscreen drawing code. I wrote a CheckBitmap
function this morning so that I can begin to see what is in these bitmaps that
aren't drawing correctly. It turns out that the one I'm focusing on now (the
document-scale indicator) does have a good set of bits in it -- the failure
seems to be happening in DrawTransparentBitmap.
My current best guess is that it's a DIB with a bad color table. I'm going to
look around and see if there is some obvious place where I've failed to set the
color table for bitmaps with child views (i.e., into which we are doing draw
operations). If that doesn't turn up anything, I'll add code to CheckBitmap
that gets the color table.
The more I think about it, though, this can't
be a DIB -- I think the deal (as I've mentioned before) is just that SetDIBits
does work with DDBs -- so my guess is that I won't succeed in getting a color
table for it. But somewhere there has to be an explanation for why these
perfectly good-looking color index values in the bitmap produce an image that is
completely black. Maybe I need to dump out the palette for the DC into which it
is drawing, but I'll bet the problem is more likely in one of the offscreen DCs
that DrawTransparentBitmap
is using to accomplish transparency. I tried using a simpler drawing function --
DrawWin32Bitmap
-- but it couldn't select the bitmap into the source-DC that it created for the
bitblt; it failed with a GDI error that I didn't recognize that was out of our
error-string table's range.
Finally! I found and fixed the problem that has been causing offscreen drawing not to work. The problem was that, if the bitmap's child view (used to do drawing into a bitmap) still existed, it had an offscreen DC into which the bitmap was still selected, and that was preventing the bitmap from then being selected into a source-DC for a BitBlt operation. Fixed it by temporarily unselecting the bitmap from the child view's offscreen DC just for the duration of the bitmap draw. This gets drawing of UI elements for document windows into pretty good shape.
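The temporary-unselect fix lends itself to a scope guard. The sketch below models it with stand-in `DeviceContext` and `Bitmap` types (no real GDI calls) that enforce the rule that a bitmap can be selected into only one DC at a time; all names are hypothetical.

```cpp
struct DeviceContext;

struct Bitmap {
    DeviceContext* selected_into = nullptr;
};

struct DeviceContext {
    Bitmap* selected = nullptr;
};

// Fails if the bitmap is already selected into some DC, or if this DC
// already holds a bitmap.
bool Select(DeviceContext& dc, Bitmap& bmp) {
    if (bmp.selected_into != nullptr || dc.selected != nullptr) return false;
    dc.selected = &bmp;
    bmp.selected_into = &dc;
    return true;
}

void Unselect(DeviceContext& dc) {
    if (dc.selected != nullptr) dc.selected->selected_into = nullptr;
    dc.selected = nullptr;
}

// Releases the bitmap from whatever DC holds it, and puts it back when
// the guard goes out of scope.
class TemporaryUnselect {
public:
    explicit TemporaryUnselect(Bitmap& bmp)
        : bmp_(bmp), dc_(bmp.selected_into) {
        if (dc_ != nullptr) Unselect(*dc_);
    }
    ~TemporaryUnselect() {
        if (dc_ != nullptr) Select(*dc_, bmp_);
    }
    TemporaryUnselect(const TemporaryUnselect&) = delete;
    TemporaryUnselect& operator=(const TemporaryUnselect&) = delete;

private:
    Bitmap& bmp_;
    DeviceContext* dc_;
};

// Stand-in for the bitmap draw: select into a source DC for the blit.
bool DrawBitmap(Bitmap& bmp) {
    TemporaryUnselect guard(bmp);
    DeviceContext source;
    if (!Select(source, bmp)) return false;
    // ... the blit from `source` to the destination would happen here ...
    Unselect(source);
    return true;
}
```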
Holdaway suggested yesterday that I use the BView's drawing mode to determine whether to draw bitmaps with transparency support -- specifically, to check whether it is B_OP_OVER -- and that seems to work very well, with one exception. Bitmaps for which we are relying on the alpha channel to accomplish transparency are no longer drawing with any transparency, because for them, the drawing mode is B_OP_ALPHA. I guess I could just use DrawTransparentBitmap for both drawing modes, but it's not really a good solution -- I can make the New Doc dialog bitmap look better by making the white pixels transparent, but it still looks ugly. So, while all this bitmap stuff is still fresh in my mind, I think I'm going to spend a few hours trying to get alpha-channel support into our bitmap-drawing code.
I just noticed that we seem to have a pretty bad redraw problem with listboxes: the new-document-dialog listbox draws correctly when it first appears, but if you obscure it temporarily, it doesn't redraw at all. My current guess is that this has something to do with the BScrollView that "covers" the list box in some way that I don't completely remember. This is just a reminder to myself that I need to look into this.
I finally got the grid to draw correctly in graphics documents today. The
problem turned out to be twofold. First, the app code was counting on
BView::StrokeLine to turn on individual pixels when passed identical start/end
points; my implementation of StrokeLine wasn't doing that until a few hours
ago. The second problem was that BView::AddLine wasn't respecting the rgb_color
passed into it -- that was supposed to set the BView's HighColor. Now it does,
although I just realized that I don't know whether it is supposed to do it only
temporarily; right now it does not restore the previous HighColor, so it may
have side effects that don't happen under BeOS.
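The single-pixel case is easy to get wrong if StrokeLine sits on top of a primitive that, like win32's LineTo, draws up to but not including the end point: a line whose start and end coincide then draws nothing. The sketch below models that on a plain pixel set; the `Canvas` type and both function names are made up for illustration, and it is an assumption that this is what was happening in the emulation.

```cpp
#include <cstdlib>
#include <set>
#include <utility>

typedef std::pair<int, int> Pixel;
typedef std::set<Pixel> Canvas;

// Models an end-exclusive line primitive: plots every pixel from
// (x0,y0) toward (x1,y1) but stops short of (x1,y1) itself.
void LineExcludingEnd(Canvas& canvas, int x0, int y0, int x1, int y1) {
    const int dx = std::abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    const int dy = -std::abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;
    while (x0 != x1 || y0 != y1) {
        canvas.insert(Pixel(x0, y0));
        const int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
}

// BeOS-style stroke: the end point is always lit, so identical
// start/end points turn on exactly one pixel.
void StrokeLine(Canvas& canvas, int x0, int y0, int x1, int y1) {
    LineExcludingEnd(canvas, x0, y0, x1, y1);
    canvas.insert(Pixel(x1, y1));
}
```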
I plan to move on to WM_SETFOCUS / WM_KILLFOCUS handling tomorrow
morning.
I didn't quite get to the SetFocus/GetFocus work yesterday; wound up working instead on drawing problems in WP documents: text was drawing as a completely black rectangle. The problem turned out to be an incorrect definition of the B_TRANSPARENT_COLOR value -- I'd just set it to zero because the real value isn't documented. WP documents seemed to be storing it as 0x00777477, however, and Scott Lindsey wrote a one-line BeOS app that confirmed that this is indeed the right value. With that problem fixed, WP documents seem to be drawing pretty well now. In GR documents, however, WP frames with transparent backgrounds still draw completely black. Holdaway suggested figuring that out before moving on to the keyboard stuff, but I'm hoping it will be something simple.
Holdaway suggests adding a checkbox to the debug dialog labelled "Heap checking" that would turn on/off a call to CheckHeap() immediately after each message is processed. My guess is that we want to put it some place where it will be called on all platforms -- I'm not quite sure where that would be for BeOS. I'll check with Lindsey tomorrow.
I have keyboard input in WP documents sort of working now, although somewhere
I'm going to have to add a lot of code to make keyboard events look more like
what BeOS is expecting. For example, when you press the Enter key under
Windows, the WM_CHAR that gets generated is a 0x0d -- a carriage return. Our
app code thinks it should be 0x0a instead, so I may just add a translation
table for this kind of thing.
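A translation table of that kind could start out as small as this. Only the 0x0d to 0x0a mapping comes from the case above; the `KeyMapping` struct and `TranslateChar` function are hypothetical, and further entries would be added as other mismatches turn up.

```cpp
#include <cstddef>
#include <cstdint>

// One entry per character code that differs between the two systems.
struct KeyMapping {
    std::uint32_t from;  // code delivered by Windows
    std::uint32_t to;    // code the app code expects
};

static const KeyMapping kKeyTable[] = {
    {0x0d, 0x0a},  // Enter: carriage return -> line feed
};

// Return the translated code, or the input unchanged when the table
// has no entry for it.
std::uint32_t TranslateChar(std::uint32_t ch) {
    for (std::size_t i = 0; i < sizeof(kKeyTable) / sizeof(kKeyTable[0]); ++i)
        if (kKeyTable[i].from == ch) return kKeyTable[i].to;
    return ch;
}
```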
I also started looking just a little while ago at
caret-drawing, which isn't working very well right now: you just kind of see a
flicker of a caret as you type. Here is where the caret-drawing action seems to
be: TWPPart::EraseCaret.
Keyboard input in WP documents is now somewhat functional. You can type, and
you can navigate through the document using the arrow keys. I think you can also
use the Insert and Delete keys, and maybe even PageUp
and PageDown,
but I haven't tested those yet.
I think I'm going to work next on
getting the WP caret to draw (i.e., flash) correctly, which will require a more
correct implementation of the BeOS
pulse stuff, but I wanted to make a quick note first about one remaining problem
that most likely has something to do with getting the initial focus set
correctly: until you press and release the Windows start key (or maybe it's just
until you activate and reactivate the app once) you get WM_SYSCHAR and
WM_SYSKEY... messages instead of WM_CHAR and WM_KEY... messages. Trying to force
focus to the right window via SetFocus()
doesn't work: the SetFocus()
fails with an ERROR_INVALID_MENU_HANDLE. I haven't been able to figure out why.
We now have correct Pulse() support for BViews. It turned out not to be as
simple as at first I thought it would be; lots of little nuances required to get
it right, and I did leave one part of it undone. Because it appears that the app
always wants Pulse messages ten times a second, I just set up the win32 timer to
generate WM_TIMER messages at that rate. This won't support changing the pulse
rate without some more work. Oh; and I also have not yet done anything to kill
the timer when the hwnd goes away. That may not strictly be necessary, but it
would probably be good form to do it.
This makes the WP caret visible
(finally), but it is still not flashing because I'm not yet supporting the
B_OP_INVERT draw mode. That's next.
It wound up taking a surprising amount of effort to get the WP cursor to
flash correctly, but I'm certain that all of the effort will pay off pretty
quickly as we work on other things. For example, the cursor code depended in
some way I can't remember on getting a correct result back from BWindow::IsActive
(probably because it only wants to flash the cursor in an active document), but
we didn't have any of the necessary WM_ACTIVATE handling; we do now. It depended
on support for B_OP_INVERT drawing mode in StrokeLine,
which we didn't have until early this afternoon. It depended on getting correct
results back from real_time_usecs, which until just a little while ago was
returning milliseconds instead of microseconds; it works correctly now. And of
course it depended on DispatchMessage
to recursively examine all BView children of a BWindow, and not only to dispatch
B_PULSE messages to those with the B_PULSE_NEEDED flag, but also to call the
Pulse() methods for these BViews directly.
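The recursive walk described above could be sketched roughly as follows. This is not the emulation's actual code: View is a stripped-down stand-in for BView, kPulseNeeded stands in for B_PULSE_NEEDED with an illustrative value, and the sketch only calls Pulse() directly, leaving out the B_PULSE message dispatch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the BeOS view flag; the numeric value is illustrative only.
const std::uint32_t kPulseNeeded = 0x1;

// Hypothetical, minimal view type: flags, children, and a pulse counter.
struct View {
    std::uint32_t flags = 0;
    std::vector<View*> children;
    int pulseCount = 0;

    void Pulse() { ++pulseCount; }
};

// Visit every view under 'view'. Only views that set the flag are pulsed,
// but children are always visited, even when their parent did not ask.
void DispatchPulse(View& view)
{
    if (view.flags & kPulseNeeded)
        view.Pulse();
    for (View* child : view.children)
        DispatchPulse(*child);
}
```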
This work also exposed an
error in our BMessenger implementation: attempts to remove messages (in this
case, B_PULSE messages) from an empty message queue were causing crashes, but it
was easy to fix.
I think I'm now going to get the beginnings of some
click-handling code into the wndprocs -- probably just placeholders for the time
being that will let us set breakpoints on mouse messages.
Made good progress on several different bitmap-related problems today, mostly
because, in trying to figure out what's wrong with my AlphaBlending
code, I stumbled upon a small library of routines for dealing with DIBs and
DIBSections in Petzold's book. They give us a convenient way to pass around
device-independent bitmaps that can be drawn into and directly manipulated
via a pointer into the bit buffer. I first tried it as a wholesale
replacement for the old bitmap drawing code; it took a couple of hours to get it
all to work, but turned out to be a lot slower than the old code, which
was using SetDIBits/GetDIBits
on device-dependent bitmaps and maintaining a copy of each bitmap for
manipulation through BBitmap::Bits(). So I made it runtime-switchable -- both
systems are in place now, and it is all working and checked in; 8-bit bitmaps
use DDBs, and everything else uses DIBs. This seems to solve the performance
problems; I can't see that anything is noticeably slower when we draw this way.
And I think it may be an acceptable solution to the problem of having to
maintain copies of bitmaps just to support access through BBitmap::Bits(), but
we'll have to see. The performance problems with 8-bit DIBs may be solvable by
converting them to DDBs after they are fully created.
There are some
lingering problems. Petzold's code uses IsBadReadPtr()
to determine whether an HBITMAP is really an HDIB, and when it's not, this
generates an Access Violation debug message. It's just a message because IsBadReadPtr()
temporarily suppresses GPF handling, but the amount of debug traffic it
generates is irritating. It shouldn't be too hard to come up with a better way
to identify an HDIB.
And AlphaBlending
is still not working. My theory had been that it wasn't working because our use
of DDBs meant that the bitmap's bits had been converted into a device-dependent
format that subtracted out alpha bytes, but that seems not to be the
explanation. Still haven't been able to find so much as one example of working
AlphaBlend
code.
Got the main parts of click handling to work yesterday. You can move the WP
cursor around now by clicking in a document's content area, and you can click on
at least some (maybe most) of the ribbon tools. Selected text is drawing
completely black, however; I'm hoping to figure that out today.
I'm also
not satisfied yet with how GetMouse()
is interacting with the mouse-message posting code in the wndprocs. It is still
completely timing dependent (i.e., it always returns the real-time state of the
mouse), even though the app is calling it in a way that should cause it to
return information about the last mouse message. It's working, though, so
maybe I shouldn't worry about it.
Here's a UI change I'd like to make in
the app, while I'm thinking about it. I don't think Windows users will like
having to relaunch the app after closing the last open document window. It seems
like the New Document dialog ought to reappear automatically when there are no
more document windows. Its Cancel button should be labelled "Exit" in this
situation.
Fixed the text selection problem just by making FillPolygon respect the current drawing mode. It doesn't look too great to me, however. Holdaway says he's in favor of adding a text-selection drawing mode that redraws in selection colors (as opposed to just doing a rectangle inversion), but before doing it, we need to find out exactly how anti-aliasing under Windows works -- whether it uses the actual background color, or just the default for the DC. Also need to know about this under Linux.
On Friday I managed to get WP frames in GR documents to draw out correctly. You can type in the frame, and text displays correctly (had been black on black when we first started seeing WP frames). You can't click out of the frame, however -- somewhere a list of BView children for a BWindow is not being maintained correctly, and you wind up dealing with pointers to deleted objects. That's what I was working on at the end of the day Friday, but I'm looking now at whether this is still the highest-priority item on the task list.
Partly at Holdaway's suggestion, I'm working now on getting a correct implementation of BWindow::UpdateIfNeeded(). It's turning out to be a little harder than I thought. It's going to require that we get the BeOS clipping region to match the win32 inval region during a Draw. Up until just a little while ago, we didn't even have BRegion::Intersects implemented, so we had no way of testing whether a BView was in the clipping/invalid region. That's working now, but the clip still isn't getting set correctly.
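For reference, the intersection test itself is simple if a region is treated as a list of rectangles. The sketch below assumes that representation and inclusive edges; Rect and Region here are hypothetical stand-ins, not the emulation's BRect and BRegion.

```cpp
#include <cassert>
#include <vector>

// Hypothetical rectangle with inclusive edges (an assumed convention).
struct Rect {
    float left, top, right, bottom;
};

// Two rectangles overlap when they overlap on both axes.
bool RectsIntersect(const Rect& a, const Rect& b)
{
    return a.left <= b.right && b.left <= a.right &&
           a.top <= b.bottom && b.top <= a.bottom;
}

// Hypothetical region: a plain list of rectangles.
struct Region {
    std::vector<Rect> rects;

    // True if 'r' touches any rectangle in the region.
    bool Intersects(const Rect& r) const
    {
        for (const Rect& mine : rects) {
            if (RectsIntersect(mine, r))
                return true;
        }
        return false;
    }
};
```

During a Draw, a view whose frame fails this test against the invalid region could be skipped.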
I just finished checking in some big changes in the way win32 display
contexts are managed for BViews. We now use a couple of accessor functions: GetViewDC
and SetViewDC.
The mWin32DisplayContext
member variable is private, and is now the only HDC member variable in
BView. I don't know for sure that it will work to have just this one HDC member
variable -- there may be situations, for example, where we make a BView that
already has a DC a child of a BBitmap for offscreen drawing purposes -- but
probably not, and this system is a great simplification.
We do still use
an AutoDC
object in BView.cpp, but its purpose now is just to get the DC into the right
drawing-mode and font-selection state for a BeOS
drawing operation, and then to restore the DC's state when it goes out of scope.
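The save-and-restore pattern behind an object like AutoDC can be sketched without any win32 calls. DCState and AutoState below are hypothetical stand-ins: the real class works on an HDC and might use GDI calls such as SaveDC and RestoreDC, which this sketch does not attempt to reproduce.

```cpp
#include <cassert>

// Hypothetical stand-in for the pieces of DC state that matter here.
struct DCState {
    int drawingMode;
    int fontId;
};

// Scope guard: apply the requested state on construction and put the
// previous state back when the guard goes out of scope.
class AutoState {
public:
    AutoState(DCState& dc, int drawingMode, int fontId)
        : mDC(dc), mSaved(dc)
    {
        mDC.drawingMode = drawingMode;
        mDC.fontId = fontId;
    }

    ~AutoState() { mDC = mSaved; }

    // Copying a guard would restore the state twice, so forbid it.
    AutoState(const AutoState&) = delete;
    AutoState& operator=(const AutoState&) = delete;

private:
    DCState& mDC;
    DCState mSaved;
};
```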
Modal dialogs are now truly modal. Getting it to work right required that we
identify the dialog BWindow's owner in InitBWindow,
and that we enable and disable the owner BWindow when the dialog is created and
destroyed. Added an mDialogOwner
member variable to BWindow for the latter purpose. Used GetForegroundWindow
for identification of the dialog BWindow's owner, which does create certain
timing problems when a dialog is created by a BWindow that is going away --
dealt with those just by making sure that the foreground HWND is one of ours and
is not another modal dialog or a floater.
Remaining problem: clicking on
the disabled window behind a modal dialog causes it and the dialog to trade the
activation state three or four times. Probably some side effect of mouse
handling code that needs to account for clicks in disabled windows.
I'm
looking now at why we are crashing when you try to bring up one of the app's
floaters ("Document History" or "Views & Layers").
I'm working now on getting the Document History floater to appear (and maybe later to actually work). It's crashing because of an empty table down in ListEngineer::Recalc somewhere. Yesterday afternoon I noticed that ListEngineer::Recalc is generating a lot of debug output that is apparently relevant to this problem, but it was using printf, assuming the output would appear in the launching console. That doesn't work under Windows, of course, so I changed all the app's direct printf calls so that they now use the BeOS PRINT() macro. It expands into a _debugPrintf function call, which should in turn call just a regular printf under BeOS, but which I implemented (using va_list and OutputDebugString) to write to the debugger console under Windows. It required a lot of header changes yesterday, since Debug.h had previously been #ifdef'd out of Windows builds completely. That is no longer true (which reduces header #ifdef'ing), and we now have a Debug.cpp which implements a bunch of things which may be useful for cross-platform debugging code.
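A rough sketch of a debug-print function along these lines, assuming a fixed 1024-byte buffer. DebugPrintf and FormatDebugMessage are hypothetical names, not the actual _debugPrintf, and the non-Windows branch writes to stderr purely for illustration.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Format printf-style arguments into a string. Output longer than the
// buffer is truncated (the buffer size is an arbitrary choice here).
std::string FormatDebugMessage(const char* format, va_list args)
{
    char buffer[1024];
    std::vsnprintf(buffer, sizeof(buffer), format, args);
    return std::string(buffer);
}

// Send the formatted text to the debugger console under Windows, and to
// stderr elsewhere. Returns the number of characters written.
int DebugPrintf(const char* format, ...)
{
    va_list args;
    va_start(args, format);
    std::string text = FormatDebugMessage(format, args);
    va_end(args);

#ifdef _WIN32
    OutputDebugStringA(text.c_str());
#else
    std::fputs(text.c_str(), stderr);
#endif
    return static_cast<int>(text.size());
}
```

A PRINT-style macro could then expand to a DebugPrintf call in debug builds and to nothing in release builds.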
I've determined that the problem (or at least the first problem) with these floaters has not much to do with floaters, but a lot to do with weaknesses in my BScrollView / BListView implementation. That's about all I know right now.
The problem with BListViews in floaters turned out to be fairly easy to solve: I had overlooked a couple of sentences in the BeBook which indicated that some of these list functions are supposed to return NIL (instead of ASSERTing) on out-of-range index values. So, as of Friday night, the floaters were in pretty good shape, including all the stuff we have to do to make them activate and deactivate in parallel with the app.
I wound up spending most of yesterday eliminating needless differences
between Linux and Win32 versions of the code that maintains BView/BWindow
hierarchies. It turned out not to be as easy as I expected, but it's all working
now, and it should help prevent more problems like the one we had yesterday
morning, where Linux-related changes in BView::AddChild
had unintended win32 consequences that prevented us from being able to run for
several hours. Most of the hierarchy maintenance code is now shared between
platforms, and it's both simpler and more clearly correct than it was
before. The AllAttached
hooks, for example, are now being called; we weren't doing that under Windows
before yesterday.
I'm now trying to figure out a couple of errors in the
placement of items in the floaters that just appeared for the first time on
Friday: content areas are shifted down too far, and certain label names aren't
working.
I've been working for the last couple of days on fixing various problems that
have appeared for the first time in two floaters: Document History, and Views
& Layers. All of the problems I've fixed so far have been fairly generic
emulation errors. Yesterday, for example, I found a mistake in BTextView::TextLength
that was causing it always to return zero; that was why the icons in Views &
Layers were being drawn at the left edge of the window. And I found that
BWindow::BeOSToWin32WindowStyles
wasn't respecting the BeOS
flags that are supposed to control whether a window is resizable, minimizable,
closable, etc. I fixed that just for the particular feature I'm working on now
-- resizability of these floaters -- but I'll need to remember to fix it for
some of these other Window flags.
Actually, now that I think of it, I'm
going to at least add UNIMPLEMENTED warnings for all the BeOS
window flag values that aren't yet supported, such as B_AVOID_FOCUS,
B_NOT_CLOSABLE, etc., so that we can at least see what we're missing.
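One way to structure that kind of flag translation so unsupported flags can't slip by silently is sketched below. All constants are hypothetical stand-ins with illustrative values (the real ones come from the BeOS and win32 headers), and mapping "not closable" to removing the system menu is an assumption, not necessarily what BeOSToWin32WindowStyles does.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Stand-ins for BeOS window flags; values are illustrative only.
constexpr std::uint32_t kNotResizable   = 0x01;
constexpr std::uint32_t kNotMinimizable = 0x02;
constexpr std::uint32_t kNotClosable    = 0x04;
constexpr std::uint32_t kAvoidFocus     = 0x08;

// Stand-ins for win32 window style bits; values are illustrative only.
constexpr std::uint32_t kStyleSizeBorder  = 0x10;
constexpr std::uint32_t kStyleMinimizeBox = 0x20;
constexpr std::uint32_t kStyleSysMenu     = 0x40;

// Translate flags to styles. Any flag that is not handled is reported
// through 'unhandled' and logged, so missing support is visible.
std::uint32_t ToWin32Styles(std::uint32_t beFlags, std::uint32_t* unhandled)
{
    std::uint32_t styles = kStyleSizeBorder | kStyleMinimizeBox | kStyleSysMenu;
    if (beFlags & kNotResizable)   styles &= ~kStyleSizeBorder;
    if (beFlags & kNotMinimizable) styles &= ~kStyleMinimizeBox;
    if (beFlags & kNotClosable)    styles &= ~kStyleSysMenu;

    const std::uint32_t handled = kNotResizable | kNotMinimizable | kNotClosable;
    *unhandled = beFlags & ~handled;
    if (*unhandled != 0) {
        std::fprintf(stderr, "UNIMPLEMENTED window flags: 0x%x\n",
                     static_cast<unsigned>(*unhandled));
    }
    return styles;
}
```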
Adding those UNIMPLEMENTED warnings for BeOS
window flags had very good effects. It showed that we were only missing three,
so I went ahead and implemented them, and in the process fixed a very
fundamental problem in our handling of WS_VISIBLE. I'd originally been creating
all windows visible by default and abusing the B_NOT_MOVABLE flag as a way of
making an exception in cases where I really needed windows to be created
initially invisible; I was having to do that because there is no B_NOT_VISIBLE
flag. The BeOS
way is to create all windows initially invisible, and to show them only when you
call BWindow::Show() -- that's the way the win32 emulation works now, too, and
it fixes a lot of ugliness during window creation.
When I made this
change, the app started crashing on document-window creation because DocViewPtr
was being called prematurely on an out-of-context call to BWindow::WindowActivated
-- the SetDocView
was being skipped because the window was now invisible during creation. I fixed
that by getting our handling of WM_ACTIVATE messages right -- we now enqueue
B_WINDOW_ACTIVATED messages for handling in DispatchMessage
instead of calling WindowActivated
directly from the wndproc -- and as a result, we no longer have the window
activation problem that has been requiring us to deactivate and reactivate newly
created document windows.
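The change from calling the hook directly to enqueueing a message can be illustrated with a small sketch. Window, Event, OnNativeActivate, and DispatchPending are hypothetical names; the point is only that the native callback records the event and the message loop handles it later, once window setup has finished.

```cpp
#include <cassert>
#include <queue>
#include <vector>

enum class Event { WindowActivated, WindowDeactivated };

// Hypothetical window type with a pending-event queue.
struct Window {
    std::queue<Event> pending;
    std::vector<Event> handled;

    // Called from the native window procedure: record the event only.
    void OnNativeActivate(bool active)
    {
        pending.push(active ? Event::WindowActivated
                            : Event::WindowDeactivated);
    }

    // Called from the message loop: handle events in arrival order.
    void DispatchPending()
    {
        while (!pending.empty()) {
            handled.push_back(pending.front());
            pending.pop();
        }
    }
};
```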
I'm close to having all the Graphics-document drawing tools working. This
morning I finally found the error that was causing so much garbage to be drawn
using the freehand tools: it was just a mistake in the data passed into a BShape
iterator (emulation level). I then found and fixed another error in BMessageQueue::RemoveMessage
-- it was only examining the first and last messages in the queue, so messages
in the middle still had pointers to them in the queue after being deleted. Right
now I'm trying to figure out why we are getting bogus NIL messages coming out of
the queue.
There is probably a lesson here somewhere about the
importance of using tested code when you can. My original implementation of BMessageQueue
used an STL queue, and probably had none of these very basic list management
problems. Scott Lindsey replaced it with a hand-rolled queue so as to reduce
code size and external dependencies (and maybe also to solve Linux build
problems -- I'm not sure about that), but it has been an expensive change. The
BMessageQueue
change logs show five separate fixes for this code already, and I'm probably
going to have a sixth within the next few hours. Each one of these fixes has
taken probably one or two hours on average.
Sure enough. Found and fixed another error in BMessageQueue::RemoveMessage -- the queue was getting into an incorrect state when we called BMessageQueue::RemoveMessage on the last message in the queue. This gets all of the graphics-document drawing tools into semi-functional condition, so I think I'm going to work for a while on getting parts to build.
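For comparison, here is a minimal singly linked queue whose removal covers the cases that went wrong above: a message in the middle, the last message, and an empty queue. Message and MessageQueue are hypothetical stand-ins, not the emulation's BMessage and BMessageQueue.

```cpp
#include <cassert>

// Hypothetical message node for an intrusive singly linked queue.
struct Message {
    int what;
    Message* next = nullptr;
    explicit Message(int w) : what(w) {}
};

class MessageQueue {
public:
    // Append at the tail.
    void Add(Message* m)
    {
        m->next = nullptr;
        if (mTail) mTail->next = m; else mHead = m;
        mTail = m;
        ++mCount;
    }

    // Unlink 'target' wherever it sits. Returns false if it is not queued.
    bool Remove(Message* target)
    {
        Message* prev = nullptr;
        for (Message* cur = mHead; cur; prev = cur, cur = cur->next) {
            if (cur != target) continue;
            if (prev) prev->next = cur->next; else mHead = cur->next;
            if (cur == mTail) mTail = prev;  // removing the last message
            cur->next = nullptr;
            --mCount;
            return true;
        }
        return false;
    }

    // Pop the head, or return nullptr when the queue is empty.
    Message* Next()
    {
        Message* m = mHead;
        if (!m) return nullptr;
        mHead = m->next;
        if (!mHead) mTail = nullptr;
        m->next = nullptr;
        --mCount;
        return m;
    }

    int Count() const { return mCount; }

private:
    Message* mHead = nullptr;
    Message* mTail = nullptr;
    int mCount = 0;
};
```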
I swear this GNU make is going to kill me, or at least make me tear out the rest of my hair. Consider the following three-line makefile:
_target:
	-@echo TEMP is $(TEMP)
	-@echo TEMP is "$(TEMP)"
Under GNU make 3.79, here is what it produces:
TEMP is D:DOCUME~1COLEBR~1LOCALS~1Temp
TEMP is D:\DOCUME~1\COLEBR~1\LOCALS~1\Temp
So its rule apparently is that it will treat backslashes as quoting characters unless they appear in strings enclosed by double quotes. Note that it does its best not to give you any clue that this is what it is doing, because if you allow make to do its own echo of these commands (i.e., if you remove the '@' prefix from the operation lines) the echoed strings always include the backslashes.
Here is another abominable thing about GNU make, while I'm thinking about it. Macro names are case sensitive, which is not a bad thing in itself, but macros inherited from the environment must be written in all-caps, because this thing converts all environment variables to uppercase. This seems to be a change from earlier versions. Some change notes would certainly help.
Okay. Here is one change in the 3.79 version of win32 GNU make that I admit (begrudgingly) is good: they seem no longer to be exec'ing through COMSPEC all the time, and so they aren't always enforcing that stupid 1024-char limit on the length of command lines. Sometimes they still do, however; haven't figured out what the rule is on that.
God what an awful week this has been. Most everything is finally building under GNU make now, including all but two of the parts, but getting it all to work has been really tough.
It looks like I'm about to get everything building again this morning --
about half the parts were no longer compiling or linking, but the problems were
easy to fix. Only one build error in BeEmulation.
None in Phoenix.
I think my next step is going to be implementation of a
resource-handling scheme that we can actually ship with. We'll need it anyway
just to get parts working (unless Scott Lindsey did something about the problem
of conflicting "resources" folder names under the app last week) so we might as
well go ahead and deal with it now. Having the right resources bound into each
.exe and .dll will also save us some debugging grief over the next few months.
Parts are now sort of working -- spreadsheet documents can be created and are somewhat functional. We're running into heap corruptions in several places (bringing up the SS format dialog, for example), so I'm adding the dHEAPCHECKING flag to the debug array, as Holdaway suggested doing a month or two ago. It should make it a lot easier to find this kind of thing.
I just got bit for the second time by the strangest GNU Make bug yet: sometimes it fails to parse headers.depend because of something mysterious in the first 50 or so lines. It happened on Monday in one of the Parts makefiles, and I discovered I could make it go away by commenting out one perfectly innocuous dependency in the first dependency list. It just now happened in the headers.depend #included by the app's Makefile, where it may be more serious. It's looking more and more like I may eventually have to start fixing bugs in GNU Make.
It looks like we're going to have to depend on the C-runtime DLL: MSVCRT.DLL
(or MSVCRTD.DLL for debug versions of the app). The reason is that it looks like
it's going to be too messy to override malloc, free, etc., in DLLs that depend
on the app (such as the spreadsheet part) so that they are forced to use the
app's versions. I had started looking into doing this because the SS heap
corruption I was struggling with yesterday turns out almost certainly to have
been caused by the problem described at http://support.microsoft.com/support/kb/articles/Q190/7/99.ASP:
you can't allocate memory in one copy of the C-runtime and deallocate it in
another.
The only disadvantage connected with using MSVCRT.DLL is that
it adds a little installation complexity. It should reduce codesize with no
noticeable effect on speed, and it means I can get rid of the AppAllocator
and AppDeallocator
callbacks that BeEmulation.dll
was using to work around this problem. My guess is that it would be very
difficult to use the same kind of workaround in the spreadsheet part because of
how C++'s new and delete work. We can begin using the C-runtime DLL just by
changing a compiler switch, so I think it's clearly the way to go.
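The callback workaround mentioned above can be sketched in a few lines. The names echo AppAllocator and AppDeallocator, but the signatures and everything else here are assumptions: the library side never calls malloc or free itself, so every block is released by the same runtime that allocated it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

typedef void* (*AllocatorFn)(std::size_t);
typedef void  (*DeallocatorFn)(void*);

// Library side (hypothetical): all allocation goes through callbacks
// that the app installs at startup.
static AllocatorFn   gAllocate   = nullptr;
static DeallocatorFn gDeallocate = nullptr;

void SetAllocatorCallbacks(AllocatorFn allocate, DeallocatorFn deallocate)
{
    gAllocate = allocate;
    gDeallocate = deallocate;
}

char* LibraryMakeBuffer(std::size_t size)
{
    return static_cast<char*>(gAllocate(size));
}

void LibraryFreeBuffer(char* buffer)
{
    gDeallocate(buffer);
}

// App side (hypothetical): the only code that touches malloc/free.
// The counter exists just to make the pairing observable.
static int gLiveBlocks = 0;

void* AppAllocate(std::size_t size)
{
    ++gLiveBlocks;
    return std::malloc(size);
}

void AppDeallocate(void* block)
{
    if (block) --gLiveBlocks;
    std::free(block);
}
```

This works for explicit buffer calls like these, but it does not cover objects created with new inside the library, which is the difficulty noted for the spreadsheet part.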
I'm
not yet ready to give up on the idea of reducing our installer dependencies to
zero, however -- I hate apps that don't work unless they are installed in just
the right way -- so I want to investigate a scheme for tacking all of our
critical DLL dependencies onto the end of the .exe so that we can split them out
if we don't find them on first launch.
The app is building again, now using libgobe and the C-runtime DLL. We can create Presentation documents for the first time. I'm thinking it may now be time to get translators building.
Tom, Scott Lindsey, Carl, and I talked on Friday about how to get grid menus
(see BMenu's B_ITEMS_IN_MATRIX) working under Windows and Linux, and ultimately
how to implement the menu-bar fill control. I'm actually not sure what it's
called; it's the one that has several pop-down grid menus for fill colors,
gradients, patterns, etc.
I don't think we reached a complete consensus
on what to do about the menu-bar fill control, but we agreed that simple grid
menus should be emulated using the BeOS
API in BeEmulation.
So I'm going to make a start on that before moving on to getting translators
building. I'll see if I can get the Shapes pop-up working first.
I'm working now on a mysterious failure to set bitmap bits correctly for the toolbar shapes popup. It seems to have something to do with copying one bitmap to another. My current theory is that I've got something wrong in a call to GetBitmapBits or SetBitmapBits -- I just found and fixed an error of this sort in a debug-only function (CheckBitmap), so maybe there's another one somewhere else. If this doesn't pan out, I'll probably look next for DDB/DIB confusions.
This one is really hard to find. I'm about to test the theory that our
SetBits()
code is not working for bitmaps of different color depths -- i.e., that you
can't call one bitmap's Bits() function and pass the result to the SetBits()
of another one unless they have identical bit depths.
Along the way,
however, I've corrected several small errors in BBitmap, so the time hasn't been
completely wasted.
What a nightmare this bitmap stuff is. It seems like the more errors -- clear, unmistakable errors -- that I fix, the worse they draw. But I've found one step that I think is definitely worth taking. Since DIBs are generally more manageable than DDBs, and in particular since they are friendlier to drawing operations now that we have this DIBSection thing, I'm going to start creating all BBitmaps as DIBs if they have a pixel depth greater than 8 or if they have the B_BITMAP_ACCEPTS_VIEWS flag. Plain old 8-bit bitmaps need to remain DDBs because DIBs are so much slower to draw.
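The selection rule just described fits in one small function. This is only a sketch: BitmapStorage and ChooseBitmapStorage are hypothetical names, and kAcceptsViews stands in for B_BITMAP_ACCEPTS_VIEWS with an illustrative value.

```cpp
#include <cassert>
#include <cstdint>

enum class BitmapStorage { DDB, DIB };

// Stand-in for B_BITMAP_ACCEPTS_VIEWS; the value is illustrative only.
constexpr std::uint32_t kAcceptsViews = 0x2;

// Deeper-than-8-bit bitmaps, and any bitmap that accepts views, become
// DIBs. Plain 8-bit bitmaps stay DDBs because they draw faster that way.
BitmapStorage ChooseBitmapStorage(int bitsPerPixel, std::uint32_t flags)
{
    if (bitsPerPixel > 8 || (flags & kAcceptsViews) != 0)
        return BitmapStorage::DIB;
    return BitmapStorage::DDB;
}
```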
Grid menus (i.e., BMenus with the B_ITEMS_IN_MATRIX flag set) are now working
at a minimum level under Windows. The main problem turned out to be in getting
correct implementations of BMenu::Show and BMenu::Hide. I had secondary problems
in RemoveChild
-- wound up completely rewriting it -- and in the BWindow constructors and
destructor -- we weren't deleting BWindows created offscreen for purposes of
drawing into bitmaps, and it was making it really hard to see other problems
where failure to delete BWindows was involved. That's all cleaned up now.
So now, I still need to fix the background color problem with these
bitmaps that are drawn offscreen, and I've got a few other little bugs and
aesthetic issues to deal with. And then it's on to the more complicated grid
menus -- fill color, fill pattern, etc. Those involve tear-off and a much more
complicated mouse-tracking system.
Here's an interesting factlet: 137 million people -- half the current number of people using the Internet -- have downloaded Macromedia's Shockwave Player so far. See http://www.zdnet.com/pcmag/stories/trends/0,7607,2617178,00.html.
It turned out that grid menus weren't working quite as well as I'd thought. My earlier #ifdef'ing-out of Tom's asserts related to BMenu window ownership was masking lots of little problems in the way AddChild and RemoveChild were working with BMenus. Pulled Scott Lindsey in to help with some of those; he suggested doing a RemoveChild at the end of BMenu::Track to disconnect a grid menu from its artificial BMenuBarWindow owner. That helped a lot, and so most of those problems are gone now at the BeOS level. Now, however, I have to find a way to deal with them at the win32 level -- have to be able to disconnect and reconnect a child window from its parent without otherwise changing its state.
Had good luck with the problem of disconnecting and then reconnecting a win32 child window from its parent: GetParent() and SetParent() do exactly what I need. I can now track through the fill control with all of its different kinds of menus appearing and disappearing, and with none of the mysterious win32 "invalid DC" errors I was getting before I had the parent/child disconnect/reconnect code working. Now I have to tackle some problems related to terminating and restarting the whole fill-control.
Okay. I think all the tracking and entry/exit problems with this god-awful
fill control are taken care of now. The entry/exit problems turned out to be
primarily timing-related. For example, if you clicked outside a tracked grid
menu at just the right time, the wndproc would see the click before the tracking
loop would see it, which meant that a win32 window activation event would
happen, causing the win32 menu to drop behind the document window. Dealt with
that just by making sure the BMenu::Track terminates on activation changes.
Now. If I could just get the damn things to draw ....
The god-awful fill color/ink control is finally working pretty well under Windows. It draws correctly (except for gradients -- need a StretchBlt implementation of DrawBitmap for that), and all the tracking entry/exit behavior seems to be correct. I'll probably get gradients drawing next just so it will be cosmetically complete -- I doubt that will take more than a few hours. Then I'll need to implement tear-off support. That will probably take a couple of days.
Yesterday I got all the translators building under Windows. I still need to modify upddlls.bat so that they get copied into the right location after a build, and of course ultimately I'd like for this step to be handled in the makefiles themselves (probably as part of building the "tree" targets that already exist). Before I move on to the next thing (probably the Chart part), however, I'm going to try to get zlib building under Windows. It's used by the Word translator, and although I worked around the link error just by #ifdef'ing out the call to zlib's uncompress() function, it reminded me that there are a couple of other places where I've done the same thing. So I'm digging around in the cygwin sources for it; I'll probably check the sources in once I get them incorporated into the win32 build process.
The Chart part is now building, but doesn't yet work. The problem seems to be
resource related, but I don't have any other clues about what it might be. I've
also got the libz compress() and uncompress() implemented, but punted on linking
in libz statically for the time being. Instead, we're depending on libz.dll
(gotten from ftp://ftp.freesoftware.com/pub/sourceware/cygwin/).
And I just learned a few minutes ago that we're actually depending on the
1.1.3-4 rev of the DLL; probably need to fix that.
At the moment,
however, I'm thinking about CVS branching and merging, and in particular about
creating a permanent branch in our CVS archives for code that isn't known to
build correctly on BeOS.
I'm thinking that Carl and I (and probably also Bruce, Tomy, and Scott Lindsey)
should normally work on a branch. We could call ours the Win32 branch, and
theirs the Linux branch. The main branch would be Cross-Platform, but would
effectively be the BeOS
branch. Those of us working on the Win32 or Linux branches would always merge
from the main branch as part of updating our local sources, but we would only
merge back into the Cross-Platform branch when we know that everything is
buildable under BeOS.
We'd have to verify that by rebooting and building the Win32/Linux branch under
BeOS.
Or maybe it would be better to have just one branch, probably called
"Port" or "Porting," so that Win32/Linux changes would normally be shared, but
not visible to developers working primarily under BeOS
until the Porting branch is actually built there.
Here's something I've needed for a long time -- just wanted to make a note of its location so I can get back to it as soon as I'm finished with the tasks that are currently on my plate: TestRec. It's a utility that will record and then play back keystrokes to another application, intended for making quick little test suites for GUI apps.
I did create a "porting" branch, and on Friday worked out the procedures for
maintaining it, but I've been in build hell ever since I switched over to
developing on the branch. I can build the app, but not all the parts and
translators, and the app can no longer create spreadsheet documents because of a
bizarre memory allocation failure that must be a side-effect of a build problem
(probably inconsistency between the app and some DLL). Every build error I've
tracked down has turned out to be real (i.e., the same error is in the HEAD
branch -- not unique to the porting branch).
So I tried to checkout and
build the HEAD branch on Gargoyle (my debug-target machine). For what appear to
be many different reasons, none of which I've been able to isolate completely, I
can't get anything to build on that machine. Just for example, in certain
contexts that don't appear to be unusual, GNU Make can't locate certain
executables on that machine unless you hard-code DOS-style paths to them. In
these same contexts, I've verified that the PATH is right. I'm using the same
version of GNU Make on both machines (v3.79), but other parts of the cygwin
distribution may be different. Unfortunately, I'm more certain that the cygwin
distribution on Gargoyle is correct and current, because I just downloaded and
installed it yesterday.
I'm still in build hell this morning. Here is the current problem. There are two Parts -- Spreadsheet and Presentation -- that depend on TBlockStreamWriter, which is part of Translators/TranslatorLib, and is not exported from the app. This is what I get when I link these Parts without linking in the TranslatorLib:
Linking obj_DEBUG/Presentation.dll ....
Creating library obj_DEBUG/Presentation.lib and object obj_DEBUG/Presentation.exp
PRExport.obj : unresolved external symbol "public: long __thiscall TBlockStreamWriter::Error(void)const "
PRExport.obj : unresolved external symbol "public: long __thiscall TBlockStreamWriter::EndChunk(long)"
PRExport.obj : unresolved external symbol "public: long __thiscall TBlockStreamWriter::WriteInt32(long,long)"
PRExport.obj : unresolved external symbol "public: long __thiscall TBlockStreamWriter::BeginChunk(long)"
obj_DEBUG/Presentation.dll : fatal error LNK1120: 4 unresolved externals
Linking TranslatorLib
into the Parts causes a huge cascade of other link problems. Linking in just the
one necessary object file from TranslatorLib
-- BlockStream
-- doesn't work because of inline'd member functions in TPoint (and elsewhere);
the inline'd functions cause duplicate symbol errors, since they are defined in
both BlockStream
and the app's export table. (Why they have to be defined in BlockStream
instead of just expanded inline, I don't remember.)
Yesterday I
experimented with exporting the TBlockStreamWriter
class so that these two Parts can get it from the app's export table. That
didn't work, but now I don't remember why, so I'm doing that experiment again.
It's an expensive test, because BlockStream.h
seems to be included just about everywhere, but it's about half done now.
Here is the reason that exporting TBlockStreamWriter
from the app doesn't work. It's kind of a tri-state problem. BlockStream.cpp
depends on TPoint, which lives in the app, which forces the Excel translator to
depend on the app's export table. The Excel translator also depends on a lot of
other non-exported stuff in BlockStream,
however, so you get duplicate-symbol errors on TBlockStreamWriter
when you try to link it with TranslatorLib
and an app symbol table that exports TBlockStreamWriter.
Of course, translators aren't supposed to import from the app, so the
solution here is probably to go ahead and export TBlockStreamWriter
in TranslatorLib,
meaning that it will be exported needlessly (but harmlessly) by translators, and
exported from the app for use by the Spreadsheet and Presentation parts. But
then where is BlockStream
supposed to get its TPoint symbols? Should TPoint be moved into libgobe?
Here's some late-breaking news. I've been trying to build the HEAD
branch on this machine (LYLE) while writing this -- it just failed trying to
build the Excel filter, which is additional confirmation, I think, that the
problem I'm trying to solve here is a real one: not a side effect of working on
the porting branch.
All those linking problems that have been giving me grief all week are
finally resolved. Several changes contributed to getting it all straightened
out. The most important is that libgobe.lib is now statically linked only
into libbe.dll under Windows; its symbols are then exported from libbe.dll
for use by everything else that needs it.
We also determined yesterday
that BeOS
builds no longer need to confine __declspec's to .cpp files -- we can now
declare imports and exports in the headers on both platforms, which eliminated a
lot of headaches in exporting libgobe symbols, and promises to make a lot of
other import/export issues much simpler.
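For illustration, a header-level declaration along these lines might look like the sketch below. The names GOBE_API, BUILDING_LIBGOBE, and BlockWriterSketch are hypothetical stand-ins, not the project's real identifiers, and expanding the macro to nothing on non-Windows compilers is an assumption made only so the sketch compiles anywhere.

```cpp
// Hypothetical export macro, declared once in a shared header.
// The library's own build would define BUILDING_LIBGOBE; clients would not.
#if defined(_WIN32)
#  if defined(BUILDING_LIBGOBE)
#    define GOBE_API __declspec(dllexport)   // building the library: export
#  else
#    define GOBE_API __declspec(dllimport)   // using the library: import
#  endif
#else
#  define GOBE_API                           // assumption: no decoration needed here
#endif

// Illustrative class only; it stands in for something like TBlockStreamWriter.
class GOBE_API BlockWriterSketch {
public:
    BlockWriterSketch() : mError(0) {}
    long Error() const { return mError; }
    void SetError(long error) { mError = error; }
private:
    long mError;
};
```

With the decoration in the header, every client that includes it sees the same import/export declaration, so nothing has to be repeated in the .cpp files.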
I'm getting a lot of assertion failures in spreadsheet documents (most or all seem to involve maintenance of a keyboard-accelerator table), but if I ignore them, I can get a simple chart to appear. Still haven't succeeded in getting the chart dialog up, however, because of stack corruption crashers that I believe are resulting from calling methods in objects that have been statically cast to the wrong type. One of these was responsible for a similar problem I was having last week, although I don't remember the context now.
Charting seems to be working pretty well now, with the exception that 3D charts are blank because BGLView is entirely unimplemented. Bruce will probably be able to implement it with calls to lower-level OpenGL functions more efficiently than I can, so I think I'm going to leave charting as is for a while, and move on to ImageProcessing parts.
Have I mentioned that I hate GNU Make? It has taken me more than a day to get the app and all its components to build on my secondary machine -- this is with a set of makefiles that works fine on my primary machine. And I'm still struggling with parts of it. One of the most irritating things about it is that its "-d" (debug) and "-p" (print symbol table) switches change the behavior of the program -- makefiles execute differently depending on whether these switches are set, which often makes them useless in debugging problems.
I'm beginning to doubt that this CVS 'porting' branch I've created is going
to be useful. I've determined that at least one of the problems I've been having
this week is rooted in a flaw in CVS's branching model. The flaw is that, when a
new file is created on the main (HEAD) branch after a side branch is already in
existence, merging the HEAD into the branch does not apply the branch label to
the newly added file. The file will be added to the local directory where you
did the merge, and CVS knows about the addition in an odd way -- you sometimes
get unusual messages from CVS updates in your local merge directory that refer
to these files as "new born" -- but if you try to checkout the whole branch in
another location, you won't get the newly added file.
This would
almost be tolerable if we had a reliable way of knowing when these files
are missing, and of getting the right versions of them when they are present.
But when I build the app with HttpClient.cpp
missing, make fails in the worst possible way -- quietly -- even though its
debug output makes clear that it knows the file is necessary and missing. The
debug output suggests that it thinks it has successfully built the targets that
are downstream from this file (i.e., the .dep and the .obj), even though it
clearly hasn't. I'm wasting too much time on this.
Tom just suggested something that may be the answer to the branching problem I was worried about yesterday. He said he thought that the porting branch should be recreated every time it is merged back into the HEAD branch. That's probably a good idea. I'm sure it would solve some of the problems with files that are added on the HEAD after the branch is created, just because we would know the branch would soon get repaired. Whether it would solve all of the problems, however, I don't know.
I've fixed all the crasher-level problems with creating ImageProcessing parts, although we still get some non-fatal assertion failures. ImageProcessing part documents are initially filled with black instead of white for some reason. That could be related to all the "Unimplemented function: BBitmap::BBitmap(B_BITMAP_CLEAR_TO_WHITE)" warnings we're getting as we run, although I've been assuming those were probably inaccurate warnings. I think my next step -- at least for the next few hours, while my mind is still in Makefile hell -- should be to get some or all of the plugins building. Then I'll probably hook up some very basic printing functionality -- maybe just the printer dialog to begin with -- so that I'll have a better understanding of the code and a better memory of win32 printing issues when we start working on printing in earnest next week.
Today I did the second complete merge of all porting-branch changes back into the HEAD branch, and it was quite a mess. I started it at 7:15 this morning, and didn't get completely finished until 12:00 or 12:30. While my memory of them is fresh, I thought I would list some of the problems I had. First, I learned (probably should have guessed) that files added to the branch are not automatically added to the HEAD when CVS merges the branch back into the HEAD. CVS knows that they have been added to the branch, so my guess is that there is some way of telling it to do this, but it wasn't obvious, and wasn't discussed in the Dr. Dobb's article I've been using to guide me through this process. Second, there seems not to be any way of telling the CVS branch-merging operation to use a particular tag on the branch as its merge base. As a result, every change that I merged into the HEAD in the first merge from porting was seen as a conflict, and had to be resolved manually. This is a good indication that Tom was right about how a system like this would have to be used, if it's usable at all -- the branch would have to be restarted after every merge from the branch to the head. (Whether you could safely merge individual files into the head while keeping most of your work on the branch, I don't know.) I did do that -- i.e., recreate the porting branch -- and wrote a batch file to do it, in case it turns out that we have to do this frequently:
cd \gobe
foreachf "/ad *.*" -e cvs checkout -A $n
foreachf "/ad *.*" -e cvs rtag -F BranchCreated-porting $n
foreachf "/ad *.*" -e cvs rtag -b -r BranchCreated-porting porting $n
Of course, ever since creating this porting branch I've had the problem that
lines containing $Header: keywords are always seen as causing conflicts,
because CVS isn't smart enough to know that it changed those lines, not
me.
A few other miscellaneous problems. Doing a "CVS update -A" in a
folder, which should throw away all "sticky" tags and check out the HEAD branch,
gets undone when you later do a "CVS update -jporting" in the same folder. My
understanding was that this last command should only merge changes from the
porting branch into the branch onto which the current folder is checked out
(i.e., the HEAD branch, because of the -A). And yes, it does do the merge, but
because it re-associates the current folder with the sticky "porting" tag, you
can't just resolve conflicts and commit. Instead, you resolve conflicts,
try to commit, get a confusing error of some kind related to this sticky
tag, do a "cvs update -A" to get rid of it, then resolve all the same damn
conflicts again, so that you can finally do a commit that will begin to
succeed. But of course it won't succeed completely. CVS complains (wrongly, so
far as I can tell) that every file which did not require manual conflict
resolution contains an unresolved conflict. None of them actually do, so you
just have to touch all these files (better hope you were using CVSREAD=yes so
you can tell which ones they were), and then finally you can commit your
changes.
Of course, you'd be nuts to actually do any of the commits
mentioned above without diff'ing first to look at the changes you are about to
inflict on the rest of your team. Since most of the diffs are CVS's stupid
$Header:-line diffs that aren't really diffs at all, you have to page through
many tens of thousands of lines of meaningless stuff to see the comparatively
few changes that will actually be applied to the codebase. (CVS commit will
finally realize that files which differ only on $Header: lines are not really
different, and will not apply these changes). What you're trying to verify is
just that all the diffs in the list are actually changes that you intend to
make. Generally, you find at this point that some of them aren't, because eyes
tend to get a little blurry after the first half hour of manual conflict
resolution. It took me three or four passes through this whole process this
morning before I was satisfied that everything in the diff listing was
really something I wanted to check in.
I'm starting work on printing this morning. Everything seems to be building well enough, except for image-processing plugins and the new JPEG filter. I think we can leave those alone for at least a few days. When printing is at an alpha level, I'll come back and get those things building, and probably implement our longer-term resource manager at about the same time.
I have drop-launch working now. Parsing the WinMain
command line into argc/argv for a C-style main() turned out to require code from
Microsoft's C-runtime (parse_cmdline) that they've made static (non-public) for
some reason. I had to copy it out of stdargv.c into our WinMain.cpp
and massage it a little for build purposes; it certainly would have been easier
if they'd just made it public. Using the real C-runtime command-line parser
gives us better handling for quoted arguments than I could probably have managed
with something homegrown.
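To show roughly what that parsing involves, here is a deliberately simplified splitter. It is not the CRT's parse_cmdline: it only handles whitespace separation and double-quote grouping, and it ignores the CRT's backslash-escaping rules entirely. SplitCommandLine is a hypothetical name, and the code assumes a C++11 compiler.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for a command-line parser (NOT the CRT algorithm):
// splits on spaces and tabs, and treats double quotes as grouping characters.
std::vector<std::string> SplitCommandLine(const std::string& cmdLine)
{
    std::vector<std::string> args;
    std::string current;
    bool inQuotes = false;   // true while inside a "..." group
    bool haveToken = false;  // true once the current argument has started

    for (char c : cmdLine) {
        if (c == '"') {
            inQuotes = !inQuotes;   // quotes group text but are not copied
            haveToken = true;       // "" still counts as an (empty) argument
        } else if (!inQuotes && (c == ' ' || c == '\t')) {
            if (haveToken) {        // whitespace outside quotes ends an argument
                args.push_back(current);
                current.clear();
                haveToken = false;
            }
        } else {
            current += c;
            haveToken = true;
        }
    }
    if (haveToken)
        args.push_back(current);
    return args;
}
```

A dropped document whose path contains spaces arrives quoted, which is the case that makes a naive split on spaces fall over.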
I still have to get the /p and /p:printer_name
switches working for dropping documents onto printer icons. Exactly how we'll
accomplish that without displaying the splash screen, putting up a document
window, etc., I don't know yet.
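Recognizing the two switch forms is the easy part. A minimal sketch, assuming the switch arrives as its own argument; PrintRequest and ParsePrintSwitch are hypothetical names, and treating an empty printer name as "use the default printer" is my assumption rather than anything decided yet.

```cpp
#include <string>

// Result of examining one command-line argument (hypothetical type).
struct PrintRequest {
    bool requested = false;   // true if the argument was /p or /p:name
    std::string printerName;  // empty: assumed to mean the default printer
};

// Recognizes "/p" and "/p:printer_name" (also "/P"); anything else is ignored.
PrintRequest ParsePrintSwitch(const std::string& arg)
{
    PrintRequest result;
    if (arg.size() < 2 || arg[0] != '/' || (arg[1] != 'p' && arg[1] != 'P'))
        return result;                          // not a print switch
    if (arg.size() == 2) {
        result.requested = true;                // plain /p
    } else if (arg[2] == ':') {
        result.requested = true;                // /p:printer_name
        result.printerName = arg.substr(3);
    }
    return result;                              // e.g. "/print" is not accepted
}
```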
Printing is starting to work now, but I just discovered something interesting that is going to require some adjustment. I've been trying to use Adobe's PDF Writer as my print driver for test purposes just because it's easier than getting up and walking down the hall to the printer each time I do a test. It turns out, however, that the fairly low-level test I was using to implement BView::IsPrinting doesn't work for this driver. I was just doing this:
int technology = GetDeviceCaps(dc, TECHNOLOGY);
... and then testing the result to see if it looked like a printer, which
boiled down just to making sure it was something other than DT_RASDISPLAY.
But that's exactly what this Adobe PDF Writer returns as its technology! I guess
they want apps to think that their driver has all the capabilities of a
full-screen display.
Anyway. It just means I have to use some other
technique for determining whether a BView is a print view. That's probably
something I could just as easily do with a high level flag set and unset at the
beginning and end of a print job.
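A minimal sketch of the flag approach, assuming the flag lives on the view and is toggled by a scope object so that it gets cleared even on an early return. ViewSketch and PrintJobScope are hypothetical names, not the emulation's actual classes.

```cpp
// Stand-in for a view that records whether a print job is in progress.
struct ViewSketch {
    bool mIsPrinting = false;
    bool IsPrinting() const { return mIsPrinting; }
};

// Sets the flag for the lifetime of the object and restores the previous
// value on destruction, so nested or aborted jobs leave the state consistent.
class PrintJobScope {
public:
    explicit PrintJobScope(bool& flag) : mFlag(flag), mPrevious(flag) { mFlag = true; }
    ~PrintJobScope() { mFlag = mPrevious; }
    PrintJobScope(const PrintJobScope&) = delete;
    PrintJobScope& operator=(const PrintJobScope&) = delete;
private:
    bool& mFlag;
    bool mPrevious;
};
```

The print loop would create one PrintJobScope at the start of the job; IsPrinting() then no longer depends on what any particular driver reports as its device technology.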
This morning I added support for proper print scaling, and discovered that our display code needs to have its hard-coded 72 DPI dependencies eliminated. Carl is going to do that, while I move on to other things. I still need to do a Cancel dialog so that users can cancel a long print job. Or at least, I think I need to do that -- I don't remember seeing one of those dialogs recently. Maybe I should check the Official Windows UI Guidelines to make sure it's still standard UI.
I'm not so sure now that we can afford to change the app's assumption that
the display resolution is always 72 dots per inch. It's certainly the wrong
assumption for Windows, where a "logical inch" has either 96 or 120 pixels,
depending on how the user has set the "large fonts" switch in the system control
panel. Even under Windows, however, a 72-point font is supposed to be one inch
high when it is printed, regardless of the printer resolution, because a font
"point" is a typesetter's concept. Scott Lindsey says the app tends to mix up
the concepts of points and pixels, and that it may be too hard to separate them.
So for the time being, I'm going to continue scaling from 72 dpi to printer
resolution.
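The scaling itself is just a ratio. A small sketch, assuming coordinates are kept in 72-per-inch units and that the device resolution comes from something like GetDeviceCaps(dc, LOGPIXELSX); ScaleFrom72Dpi is a hypothetical helper that rounds to the nearest device unit.

```cpp
// Converts a coordinate expressed in 1/72-inch units to device units at the
// given resolution, rounding to nearest (half away from zero).
long ScaleFrom72Dpi(long value72, long deviceDpi)
{
    // Widen before multiplying so large coordinates cannot overflow.
    const long long product = static_cast<long long>(value72) * deviceDpi;
    const long long half = 36;   // half of the divisor, 72
    const long long scaled = (product >= 0) ? (product + half) / 72
                                            : (product - half) / 72;
    return static_cast<long>(scaled);
}
```

So a one-inch (72-unit) line becomes 600 device units on a 600 dpi printer and 96 on a "small fonts" display, which is the behavior a 72-point font needs to come out one inch high on paper.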
What I'm most worried about now is that, to print at the
right scale, we have to change the document window's scale for the
duration of the printing operation, which means we can't redraw it while we're
printing. We had the same problem in ClarisWorks.
It was expensive.
Anyway. I'm still working this morning on the "Cancel
printing" dialog. Since it's simple, and since we may have other places where we
need a platform-specific "Cancel current operation" dialog, I'm constructing the
dialog dynamically instead of using a resource.
I have the PNG translator building now, as well as the libpng.lib and
zlib.lib libraries that it depends on. I took a few shortcuts in getting these
libraries to build. The png library came with an msvc folder that contained .dsp
(MS Visual C++ project definition) files for both libraries, and they turned out
to work fairly well once I added zlib source code in the place they expected
(which, fortunately, is a sensible location for us: /gobe/develop/zlib). I did
not, however, write a GNU BeOS-style
makefile for them. Instead, I added lines to our MakeAll.bat
that use msdev.exe's command-line support for building the .dsp files directly.
It's kludgy, but I think it will work well enough for our immediate purposes.
I'm moving on now to the prefs library -- libprefs -- so that we can
finally begin to make some of the app's state information persistent.
The prefs library is building now, and we have USES_LIBPREFS turned on in the
app, so we have some hope of making prefs and other settings-type things
persistent sometime soon. It doesn't seem to be working at the moment, however.
I don't know why not.
I also got the TIFF translator building yesterday.
It crashed on my first test of it. I didn't see any obvious explanation in the
call stack, but my guess is that it has something to do with shortcuts I took in
getting it to build. The main problem was that the TIFF library in
/gobe/develop/libtiff tests the standard WIN32 symbol and uses it to change at
least one part of its interface in a way I didn't fully understand. It only
affected the type of a pointer-length function argument named "client_data"
however, so I figured the type might not matter to the library itself, and used
casting to get the translator to build. That may not have been the right thing
to do.
Today needs to be a bug fixing day, probably centered around
bitmaps. I'm going to try at least to get rid of the thousands of runtime
"BBitmap::CLEAR_TO_WHITE not implemented" warnings that we get in the debugger.
I've spent most of today repairing stuff that has gotten broken in one way or another while I've been looking the other way over the last few weeks. The color/pattern/gradient-fill control, in particular, is working again. I also fixed a long-standing problem with bad display contexts that was happening on second appearances of any of those popups. Still have a heap corruption problem in the gradient popup, though.
I'm having extremely frustrating problems with CVS this morning. At first I
thought it was something broken in the version of CVS that is now being
distributed with the full cygwin distribution -- this is the version that
identifies itself as "Concurrent Versions System (CVS) 1.10.8
(client/server)" -- and I do still think there is something wrong with that
version, but that seems not to be the whole explanation. The problem is that,
when I use the cvs "checkout" command to get sources on my secondary machine --
which I generally do before updating my primary build machine to avoid getting
completely shut down by errors in new changes -- it sometimes fails to create
the CVS directory that is supposed to reside in each source folder. When this
happens, you get really confusing failures from subsequent CVS operations. It
appears to be the source of the "x is in the way; please move it aside" message
from CVS update operations that has been puzzling me for months. It completely
prevents you from doing CVS edit operations. It's made worse by the fact that,
when this failure happens, you still get all the folder's source files, so if
you aren't paying attention to the output of the cvs checkout command, you don't
know that it has happened until a few hours or days later when things just don't
seem to be working anymore.
It looks like it may be a timing dependent
bug, because if I just try the cvs checkout again (or sometimes, try it again a
few times) it works. It doesn't seem to matter whether I use the 1.10.8
(client/server) or 1.10.5 (client) version of CVS; both will eventually succeed
if I just try it a few times, although it seems to help if I do each folder
manually; doing them one right after the other in some automated way seems to
make it less likely that all of them will succeed. I'm adding "sleep 5" between
folders in my GetClean
batch file just to see if it helps.
Here is a foreachf command that
checks for folders that have this problem:
foreachf "/ad *.*" -s -x"/ad CVS obj obj_DEBUG" if not exist $r$n\cvs echo ERROR: "cvs" folder not created in directory $r$n.
Image-Processing documents are sort of working now. Empty ones open up with a
white background (they were black until yesterday afternoon), and as of this
morning, we can open up ImageProcessing
documents created under BeOS.
So I'm going to put them aside for a little while and return to getting us set
up for profiling.
One note about image processing, however, before I
forget it. The reason they were initially black is that they use monochrome
bitmaps, which under BeOS
use 0 bits to represent white pixels, and 1 bits to represent black pixels.
Windows uses the opposite convention for monochrome bitmaps (which, if you think
about it, makes more sense, given that RGB 0-0-0 is black), so it looks like
under Windows we are going to have to make all of our low-level drawing code
reverse black and white when the destination is a monochrome bitmap. It's
probably something that can be done somewhere in the AutoBrush
class, but I'm not going to worry about it until I have an example of something
that is not working because of our current handling of monochrome bitmaps.
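If it does turn out to matter, one alternative to changing the drawing code would be to flip the bits when a monochrome buffer crosses between the two conventions. This is only a sketch of that alternative, not what the app does; InvertMonochromeBits is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>

// Flips every bit in a 1-bit-per-pixel buffer, turning "1 = black" data into
// "1 = white" data (or back). Any row-padding bits get flipped too, which is
// harmless because they are never displayed.
void InvertMonochromeBits(std::uint8_t* bits, std::size_t byteCount)
{
    for (std::size_t i = 0; i < byteCount; ++i)
        bits[i] = static_cast<std::uint8_t>(~bits[i]);
}
```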
Getting set up to use the MSVC profiler has been difficult. There are at least a half dozen different profiling tools that come with Visual C++, the platform SDK, the Windows Resource Kit, and/or Windows 2000 itself, and I couldn't remember at first which one I had used successfully in the past. So I spent an hour or two looking for CAP -- the Call Attributed Profiler -- that looks pretty powerful, but finally figured out (I think) that you only get it with an MSDN subscription. So now I'm trying to get PREP, PROFILE, and PLIST to work; I'm pretty sure these are the tools I've used before, now that I look at them more closely. The documentation for them is here. At the moment, however, the app crashes when I run it under profile.exe, but not outside it, and there are no debug symbols for the app itself when it crashes. There are debug symbols for the two or three DLLs in the call chain, but the information they provide is very suspicious -- source doesn't seem to match the assembler at all.
I got the new BoundsChecker installed yesterday and fixed the first half-dozen or so of the errors it found; I tried to clear out the ones it found during launch and creation of a blank graphics document just so Carl and I wouldn't immediately be trying to fix all the same errors. I stopped on an error that is going to be harder to fix, but that is related to the work on tear-off menus that I need to do today. It identified places where we are treating HMENUs as HWNDs by mistake. We're getting win32 API failures that cause confusing downstream errors every time we do that, so I need to get that cleaned up before doing a lot more on tear-off menus.
I think I'm back now to a point where I can focus on tear-off menus again. I
spent most of yesterday morning working on a change in our build process that
requires BeOS
resources to be unzipped under each obj_DEBUG directory instead of under the
app, and I spent yesterday afternoon trying to get a debuggable release build up
and running so that I could have Tom look at a release-build-only crasher in the
HTML translator while he was here. I want to find out whether the god-awful
performance we're seeing in this translator is unique to the debug build, but we
can't know for sure if the release build is going to crash every time. I did
finally get the release-build problems resolved after he left. They turned out
to be caused in part by mixing debug and non-debug libraries, which was
happening because some of the makefiles still copy libraries up out of the
appropriate obj_ directories into the parent directories, so client makefiles
just get whichever version of them was built last.
That's all straightened
out now, but I still haven't succeeded in months at getting a complete
build at home. Last night I thought for sure it would work, but I was shut down
by a mysterious "bad address" failure from GNU's bison utility. It's always
something.
I started work yesterday afternoon on getting drag rectangles to work during menu tear-off -- couldn't find any information about the recommended way to do it, so I decided to start with a WS_EX_TRANSPARENT window. Poking around a little more this morning, I finally found an MSDN Knowledge Base article that seems to confirm that choice. I'm not sure yet how clean it's going to be, however. I'm a little worried that creating a transparent window is going to cause activation side effects, but I guess those tend to be fairly easy problems to solve.
I finally got everything to build successfully at home this weekend just by
making the intended output of that bison command -- parsedate.cpp -- by hand.
Had a bit of a scare when it crashed on launch, but then realized that the
machine I was running on has always had a problem with crashes in its printer
drivers when its default printer (which is normally remote) is not connected. So
I checked and, sure enough, other apps were also crashing when I tried to print.
We crash on launch in this situation because of our perverse desire always to
know the printable area of the default printer. It makes us look worse to crash
on launch, however, and of course looking at the current default printer to
determine page sizes is getting hopelessly old fashioned. But I guess that's
the least of our worries now.
Today I'm changing course on the drag-rect
code. I'm going to use a transparent non-moveable window overlaid on top of the
whole screen. Should be simpler and more reliable than trying to let Windows
move a transparent window around and drawing the drag rect at its edges.
We're now doing a passable job of drawing the drag rectangle for tear-off
menus. The worst current problem is a timing dependency: in a BeOS
mouse-tracking loop (which is what the app is using to make these drag
rectangles work), if you move the mouse continuously, we just keep filling up
and dispatching messages from the win32 message queue -- we don't get back up to
the BeOS
mouse-tracking loop unless you slow down enough to let the win32 message queue
empty. I'm going to try to fix it with a new version of PumpPendingWindowMessages(),
probably called something like PumpSpecifiedWindowMessages(),
which will return control to the BeOS
tracking loop on a more predictable basis, maybe once every five or ten
messages, or after processing every message of a specified kind.
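The shape of that bounded pump can be sketched without any win32 calls. In the sketch below a std::deque stands in for the win32 message queue (the real version would presumably loop on PeekMessage with PM_REMOVE and call DispatchMessage), and QueuedMessage and PumpSpecifiedMessages are hypothetical names.

```cpp
#include <cstddef>
#include <deque>
#include <functional>

// Stand-in for a win32 MSG; only the message id matters here.
struct QueuedMessage { unsigned id; };

// Dispatches at most maxMessages from the queue, and also stops right after
// dispatching a message whose id equals stopAfterId (0 disables that check).
// Returns the number of messages dispatched, so the caller's tracking loop
// regains control even if the queue never empties.
std::size_t PumpSpecifiedMessages(std::deque<QueuedMessage>& queue,
                                  std::size_t maxMessages,
                                  unsigned stopAfterId,
                                  const std::function<void(const QueuedMessage&)>& dispatch)
{
    std::size_t dispatched = 0;
    while (dispatched < maxMessages && !queue.empty()) {
        const QueuedMessage msg = queue.front();
        queue.pop_front();
        dispatch(msg);
        ++dispatched;
        if (stopAfterId != 0 && msg.id == stopAfterId)
            break;   // the message kind the caller was waiting for
    }
    return dispatched;
}
```

The point of the cap is that a continuously moving mouse can keep the queue non-empty indefinitely; returning after a fixed count (or after a specific message) puts an upper bound on how long the tracking loop is starved.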
I'm
also having trouble preventing creation of a taskbar button for the transparent
full-screen window that I'm using for drawing the drag rectangle: I may have to
use the invisible-parent technique described at http://msdn.microsoft.com/library/psdk/shellcc/shell/Shell_Int/Taskbar.htm.
Tear-off of matrix menus is now in pretty good shape, but I'm having a hell
of a time getting it to work for regular win32 menus: creating another window
while TrackPopupMenuEx()
is running seems to be poor form, so I'm reorganizing a little so as to
experiment with creating the transparent drag-rect window before the menu
pops up. Maybe then it will be possible to draw on it in the middle of TrackPopupMenuEx().
Had to touch a globally included header (View.h) to do this
reorganization, and while waiting on the resulting build, found some interesting
statistics on
monitor resolutions and color depths currently in use.
I finally got real win32 menus to tear off reliably, but it wasn't easy. It
did turn out to be necessary to create the transparent full-screen drag-rect
window before popping up the win32 menu, but with that done, simple
drawing on the drag-rect window in the middle of TrackPopupMenuEx()
turned out not to be a problem. For some reason I couldn't use SetCapture(),
however, to funnel mouse events to the drag-rect window when the user tries to
drag during a TrackPopupMenuEx(),
so I had to go find where the WM_MOUSEMOVE messages were really going and force
them back to the drag-rect window. Modifications to the app code were all
related to faking out its assumptions about separate threads, but were really
not too bad -- maybe a half dozen lines or so changed.
There are three
remaining problems that I know about right now: (1) clicking back into the MICN
("mini-icon") menu bar while a menu is popped up causes a nasty-looking
assertion failure, but it looks for all the world like an app error to me -- I
haven't been able to figure out why it doesn't happen under BeOS;
(2) it's possible to get into a state where an extra click is required to
terminate tracking in the win32 menu, but I'm not sure exactly what triggers it;
and (3) RemoveChild
is doing the wrong thing for pop-up win32 menus that are not children of a BMenuBar
-- the Transparency button on the Graphics-document button ribbon is an example;
it removes the document window's menu instead of removing the popup menu
from its parent.
Today I took a break from working on tear-off menus and did the makefile work
necessary to start adding the Windows-specific resources we will need: icons and
version resources, mostly, I think, but probably other things too. I also
repaired PromoteDebugBuild.bat
-- it had been broken by our move from resources in a .res folder to real win32
.res files linked into the binaries -- made it capable of promoting both debug
and release builds, renamed it PromotePhoenixBuild.bat,
and checked it in. Doing this reminded me that we have some .dll dependencies
that aren't dealt with by this promotion batch file -- I know we depend at least
on MSVCRT.DLL, and it seems like I discovered one or two others in trying to
make this thing run at home last weekend. I think I'm going to reboot under
Win98 and see if I can spot some of those, maybe also fix any Win98-support
problems that may have crept in recently.
While I'm thinking about it,
here is the URL for instructions on Profiling
Multiple .DLL and .EXE Files.
I'm working now on getting the app robust enough to run outside these very
stable environments presented by the three machines Carl and I use for daily
development. On my Windows 2000 home machine, for example, we don't draw
anything in the content area of a document because a certain frame rectangle --
the "main frame" as opposed to the "root frame" -- is invalid. We fill the
content area with light gray using the root frame, but nothing else draws. I
can't reproduce that problem under Windows 2000 here at work.
We have a
nasty set of problems running under Win98, however, that I can reproduce
here. We crash with a stack corruption anytime the original system wndproc for a
subclassed edit control is called, and we're having a problem with
getting the original system wndproc reliably for BViews -- it looks like
we are sometimes getting it from freed memory. This problem is sporadic and
apparently timing dependent, but it happens frequently enough to seriously
interfere with chasing other problems, so I need to figure it out right away. I
was convinced a little while ago that it was happening because HMENUs were
getting mixed up with HWNDs, but my suspicion now is just that the win32 IsMenu
function doesn't return a reliable true or false when you pass it a destroyed
HWND under Win98 -- my current best guess is that we are sometimes getting back
false positives from it under Win98, but not under Win2000. I'm going to add
some asserts and chase it under Win2000 for a while to test that.
Here is a "sample application that uses the AlphaBlend function to produce a transparent splash screen." Looks like it might come in handy when I turn my attention back to bitmap drawing problems. It includes comments from people who have apparently gotten it to work on Win95 and NT4 in addition to Win98 and Win2000.
It turns out that the win32 IsMenu
function just doesn't work as a way of distinguishing HWNDs from HMENUs under
Win98. It returns TRUE for windows that are clearly not menus, although I don't
seem to be able to catch it with an assert immediately after window creation; it
only begins to return false positives after some other triggering event that I
haven't yet identified. Maybe IsMenu()
works as an equivalent of IsWindow
for HMENUs: i.e., capable of telling you whether the menu has been destroyed or
is not an HWND at any level, but not capable of distinguishing between
kinds of HWNDs.
Who knows. The answer for us is going to have to
be a simple separation of HWNDs and HMENUs, and no reliance on IsMenu().
I'm adding another member variable for the latter to the BView class and
straightening out all the code that assumes the mWin32Hwnd member could be an
HMENU.
Rats! Even with all our HWND/HMENU confusion eliminated, we are still drawing most of the PartBar icons directly onto the Win98 desktop instead of onto the PartBar itself. This is one hell of a hard problem to track. It's timing dependent, for one thing; if you step through the events leading up to it in the debugger, it doesn't always happen. I do have an assert now that reliably catches one of the error conditions associated with it. The assert uses the Win32 WindowFromDC function to find HWNDs in the BWindow/BView hierarchy that are associated with the wrong mDisplayContext. It shouldn't take too long to track it down now.
I know now that the Win98-specific problem we are having with drawing certain icons directly onto the Win98 desktop is caused by massive HDC leakage somewhere, so I'm in the process of eliminating most of our calls to GetViewDC, which can allocate a DC that the caller may or may not free. We need to use the AutoDC class instead, which takes care of freeing the DC when it goes out of scope, if that is appropriate (i.e., if it's not a DC that we got from BeginPaint).
Finally! The Win98-specific problem with drawing icons on the desktop is gone at last. The HDC leakage was happening in DimBitmap. We were doing a GetDC(0) for every pixel in the bitmap (dumb, yes, but these bitmaps that require dimming are relatively small) and then failing to release it -- not a good thing anywhere, but especially not under Win98.
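For the record, here is a minimal sketch of the scope-guard idea behind AutoDC, together with the acquire-once-per-bitmap pattern that DimBitmap should have used. It is not the real AutoDC class: ScopedDC, FakeDC, acquire_dc, release_dc and dim_pixels are hypothetical names, and the acquire/release pair only counts outstanding handles so the sketch builds without windows.h. Real code would call GetDC and ReleaseDC in their place.

```cpp
#include <cassert>

// Hypothetical stand-ins for GetDC/ReleaseDC so the sketch builds anywhere.
// They only count outstanding handles; real code would call the Win32 APIs.
static int g_outstanding_dcs = 0;
using FakeDC = int;
inline FakeDC acquire_dc() { ++g_outstanding_dcs; return 1; }
inline void release_dc(FakeDC) { --g_outstanding_dcs; }

// Scope guard in the spirit of AutoDC: releases the DC on destruction,
// unless the DC was borrowed (for example, one handed out by BeginPaint).
class ScopedDC {
public:
    ScopedDC() : dc_(acquire_dc()), owned_(true) {}
    explicit ScopedDC(FakeDC borrowed) : dc_(borrowed), owned_(false) {}
    ~ScopedDC() { if (owned_) release_dc(dc_); }
    ScopedDC(const ScopedDC&) = delete;
    ScopedDC& operator=(const ScopedDC&) = delete;
    FakeDC get() const { return dc_; }
private:
    FakeDC dc_;
    bool owned_;
};

// Acquire one DC for the whole bitmap rather than one per pixel.
inline int dim_pixels(int pixel_count) {
    ScopedDC dc;                 // released automatically at end of scope
    int touched = 0;
    for (int i = 0; i < pixel_count; ++i) {
        (void)dc.get();          // per-pixel work would reuse the same DC
        ++touched;
    }
    return touched;
}
```

With a guard like this, every exit path from dim_pixels gives the DC back, and a borrowed DC is never released by mistake.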
I fixed a couple of other crippling problems yesterday. The stack corruption associated with subclassed EDIT controls that I mentioned on Monday turned out to be caused by failure to use CallWindowProc in forwarding unhandled messages -- we were just calling the original window proc directly. And the problem with document content areas completely failing to draw turned out to be caused by failure to get paper-size information correctly from the printer's DEVMODE structure. It reminds me that this dependence on the current default printer for document sizes is something I've grown to dislike more and more about ClarisWorks, AppleWorks, and now GP. Most documents these days are not created with any intention of printing them on the current default printer. Why document page sizes have to be constrained to that is beyond me.
I forgot to mention that I finally got Microsoft's AlphaBlend
function to work on Friday. The missing piece was just one value -- AC_SRC_ALPHA
-- that needed to be plugged into the BLENDFUNCTION.AlphaFormat
field. It was ridiculously hard to find, because the AC_SRC_ALPHA identifier
does not appear anywhere in the headers we are using, and is also not in any of
the headers you can download from Microsoft's web site as part of the Win2000
Platform SDK. I had to search the whole web for it, and finally found some web
page written almost entirely in Chinese except for a little fragment of
Visual Basic source code which gave the value of AC_SRC_ALPHA.
Anyway.
AlphaBlending
is working in the splash-screen bitmap now. I don't see any evidence that it is
working anywhere else, but that may be because of other problems. Our
Transparency slider doesn't seem to be working, for example, and I'm not sure
the Transparency menu is working, either.
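A note for whoever next goes hunting for that constant: in wingdi.h, AC_SRC_OVER is 0x00 and AC_SRC_ALPHA is 0x01. The AlphaBlend documentation also says that when AC_SRC_ALPHA is set, the source must be a 32-bpp bitmap whose color channels are already premultiplied by alpha. The sketch below shows that premultiplication for one pixel. The function name premultiply and the 0xAARRGGBB packing are my assumptions for illustration, not code from our tree.

```cpp
#include <cassert>
#include <cstdint>

// Values as defined in wingdi.h, repeated so the sketch builds without it.
constexpr std::uint8_t kAcSrcOver  = 0x00;  // AC_SRC_OVER
constexpr std::uint8_t kAcSrcAlpha = 0x01;  // AC_SRC_ALPHA

// Premultiply one 32-bit pixel packed as 0xAARRGGBB (hypothetical helper).
// Each color channel is scaled by alpha/255, rounded to nearest.
inline std::uint32_t premultiply(std::uint32_t argb) {
    const std::uint32_t a = argb >> 24;
    auto scale = [a](std::uint32_t c) { return (c * a + 127) / 255; };
    const std::uint32_t r = scale((argb >> 16) & 0xFF);
    const std::uint32_t g = scale((argb >> 8) & 0xFF);
    const std::uint32_t b = scale(argb & 0xFF);
    return (a << 24) | (r << 16) | (g << 8) | b;
}
```

Opaque pixels come through unchanged and fully transparent pixels become zero, so running this over an already-opaque bitmap is harmless.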
This morning I fixed a problem
that was causing document-window menus to go away after calling Track on a popup
menu. I'm starting to work now on some of our remaining bitmap-drawing problems.
Pattern fills, for example, seem to fill with a pattern that is only vaguely
related to the correct pattern, and gradients fill outside their boundaries.
I finally got BView::ClipToPicture
working well enough that we can clip patterns and gradients into non-rectangular
shapes, but it was one incredible pain in the butt. Here are a couple of things
I learned today that finally gave me a solution. First, when you have a GDI path
and you want to use SelectClipPath
to turn it into a clip region, WidenPath is
what determines whether the interior of the path will be included in the clip
region. If you widen the path, the resulting clip region will contain only the
path's outline; if you don't, the path's interior will be included. I had
originally added WidenPath
to ClipToPicture
because the first picture-clipping example I managed to get working under BeOS
had just two vertical lines. Since the picture had no interior, the WidenPath
was necessary; without it, the clip region was empty. There may still be
situations where we have to use WidenPath,
but my guess is that it's unlikely. We'll see. I have an assert in ClipToPicture
that will catch things like the no-interior situation; it detects and complains
about completely empty clip regions.
Second, if you call SetParent
to break the parent-child connection between two HWNDs, you need to make sure
the child is hidden first. Otherwise, Windows will invalidate at least
the parent of the parent, and maybe other HWNDs as well. My guess is that it's
not smart enough to figure out which HWNDs actually need to be redrawn, and just
invalidates everything in the hierarchy as a just-in-case measure. The problem
is that, if you do this while you're handling a WM_PAINT, it can set off a
sequence of WM_PAINTs that you can't escape, because each one will probably
trigger the same call to SetParent.
At least that's what was happening to us, down in the BView::RemoveChild()
that cleans up the offscreen bitmap into which we were drawing the picture used
to define the clip. This was a very difficult thing to find.
For at least the last six months or so, we've had to watch the Windows task bar flash several dozen times as the app launches. The reason was that BeOS BViews can be and often are created before they are attached to their parent windows. Under Windows, however, when you create an HWND without a parent, it's considered a child of the desktop and gets an entry on the taskbar, so every BView created during launch just for creation of offscreen bitmaps was briefly creating a taskbar button. I tried over the weekend to make sure that BView HWNDs are initially created invisible, since they shouldn't be visible anyway until they are attached to a window, but finally decided it would be a lot easier just to handle it in BBitmap's AddChild method. I do still think, though, that we have a more general problem here.
Here is a link that may come in handy pretty soon: "Creating
and Sending a Message" using Microsoft's CDO (Collaboration Data Objects)
API. It looks like it works only on Windows 2000, but it's not MAPI dependent
and the example looks simple enough that we might want to use it for a
Windows-2000-specific implementation of F)ile-S)end.
I made a lot of
progress yesterday. The native win32 menus in the fill control are working
again; they had lost their ability to handle clicks after the changes I made
last week to reduce side effects of the message-pump operations in
mouse-tracking loops, but they're working again -- we just needed to make sure
not to pump the win32 message queue while the mouse cursor is still inside the
menu.
And pattern-fills are now drawing correctly. The problem with the
monochrome patterns was that the buffer passed into CreateBitmap
needed to be word-aligned. The problem with the color patterns was just that we
were missing the "Ink Sets" folder underneath the app; I plan to add a step
which creates that folder to PromotePhoenixBuild.bat
today.
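On the word-alignment point: the CreateBitmap documentation requires each scan line to be word aligned, with any padding bytes set to zero. So an 8-pixel-wide monochrome pattern stored one byte per row has to be repacked to two bytes per row before it is handed over. Here is a rough sketch of that repacking. The helper names word_aligned_stride and pad_rows are hypothetical, and I'm assuming the input rows are tightly packed at one bit per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes per scan line for a 1-bpp bitmap, rounded up to a 16-bit boundary.
inline std::size_t word_aligned_stride(std::size_t width_pixels) {
    return ((width_pixels + 15) / 16) * 2;
}

// Repack tightly packed 1-bpp rows into word-aligned rows, zero padded.
// Assumes packed.size() >= ((width_pixels + 7) / 8) * height.
inline std::vector<std::uint8_t> pad_rows(const std::vector<std::uint8_t>& packed,
                                          std::size_t width_pixels,
                                          std::size_t height) {
    const std::size_t src_stride = (width_pixels + 7) / 8;
    const std::size_t dst_stride = word_aligned_stride(width_pixels);
    std::vector<std::uint8_t> out(dst_stride * height, 0);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < src_stride; ++x) {
            out[y * dst_stride + x] = packed[y * src_stride + x];
        }
    }
    return out;
}
```

The padded vector's data() pointer is what would be passed as the bits argument to CreateBitmap.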
And alpha blending is now working on objects within documents
that use bitmapped fills. Getting it to work with solid fills, line-drawing
operations, etc., is going to be tougher, but looks doable.
Shouldn't BFont::GetEscapements be doing a DotsToPoints conversion? It uses GetTextExtentExPoint, which, despite the "Point" in its name, appears to return logical coordinates (i.e., "dots"). I have a feeling this is why our cursor no longer tracks inserted text in WP frames.
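The conversion in question is just a scale by 72 over the device resolution. A minimal sketch, assuming the default MM_TEXT mapping mode (where logical units are device pixels) and assuming the resolution comes from something like GetDeviceCaps(hdc, LOGPIXELSX). The name dots_to_points is hypothetical; the real DotsToPoints signature isn't shown in these notes.

```cpp
#include <cassert>

// Hypothetical helper: convert device dots to typographic points.
// dpi is the device resolution in dots per inch; 72 points make one inch.
inline double dots_to_points(double dots, double dpi) {
    return dots * 72.0 / dpi;
}
```

At 96 dpi a 100-dot extent is 75 points, so skipping the conversion would overstate every width by a third, which would be consistent with a cursor that drifts away from inserted text.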
I made a good deal of progress over the weekend, but discovered some
disturbing things, too. I'll start with the bad news. First, it looks like using
AlphaBlend
to display a bitmap containing per-pixel alpha crashes the 166MHz remote-debug
machine I use at home when it's running under Win98. It works fine under Windows
2000, and works with bitmaps other than the Phoenix splash screen. Second, it
looks like the DEVMODE
structure returned by the printer driver for our HP 932C contains bad dmPaperLength
and dmPaperWidth
values; it looks like they may be in millimeters instead of tenths of a
millimeter. If that information isn't going to be reliable, it will be worse
than not having it at all.
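For reference, DEVMODE documents dmPaperWidth and dmPaperLength in tenths of a millimeter, so US Letter should come back as 2159 x 2794. Here is a sketch of the conversion to points, plus one possible sanity check for the suspected whole-millimeter case. The 500 threshold in looks_like_whole_mm is purely my own guess (no paper we care about is under 50 mm on both sides), not a documented rule, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// DEVMODE reports paper dimensions in tenths of a millimeter.
// There are 254 tenths of a millimeter per inch and 72 points per inch.
inline double tenths_mm_to_points(int tenths_mm) {
    return tenths_mm * 72.0 / 254.0;
}

// Hypothetical plausibility test (an assumption, not a documented rule):
// if both dimensions are under 50 mm when read as tenths of a millimeter,
// the driver has probably reported whole millimeters instead.
inline bool looks_like_whole_mm(int width, int length) {
    return width > 0 && length > 0 && width < 500 && length < 500;
}
```

A driver that reports Letter as 216 x 279 would be flagged by the check, while 2159 x 2794 would pass and convert to 612 x 792 points.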
Mostly as a learning exercise to help us make
a better decision about whether to use libart under Windows, I also wrote a
couple of functions that will probably come in handy, particularly if I can make
them faster: FillRectUsingAlpha
and TextOutUsingAlpha.
They work like the win32 FillRect
and TextOut
functions, but take an additional transparency argument that is used
whenever it is less than 0xFF (opaque). Experimenting with using these functions
to draw semi-transparent spreadsheet frames exposed a potential problem with SelectClipPath;
it appears that, when you draw text in a GDI path, and then use SelectClipPath
to convert the path into a clipping region, the interiors of the characters are
clipped out instead of in. We will probably have to fix that with a RGN
operation of some kind.
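The per-pixel arithmetic behind a function like FillRectUsingAlpha is ordinary source-over blending. The sketch below shows only that arithmetic for one pixel, under my own assumptions: colors packed as 0x00RRGGBB, alpha in the range 0x00 to 0xFF, and rounding to nearest. It is not the actual FillRectUsingAlpha code, and blend_channel and blend_pixel are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Blend one 8-bit channel: alpha 0xFF is opaque, 0x00 leaves dst untouched.
inline std::uint8_t blend_channel(std::uint8_t src, std::uint8_t dst,
                                  std::uint8_t alpha) {
    const unsigned mixed = src * alpha + dst * (255u - alpha) + 127u;
    return static_cast<std::uint8_t>(mixed / 255u);
}

// Blend a 0x00RRGGBB fill color over a 0x00RRGGBB destination pixel.
inline std::uint32_t blend_pixel(std::uint32_t src, std::uint32_t dst,
                                 std::uint8_t alpha) {
    if (alpha == 0xFF) return src;   // opaque: take the fast path
    std::uint32_t out = 0;
    for (int shift = 16; shift >= 0; shift -= 8) {
        const std::uint8_t s = static_cast<std::uint8_t>((src >> shift) & 0xFF);
        const std::uint8_t d = static_cast<std::uint8_t>((dst >> shift) & 0xFF);
        out |= static_cast<std::uint32_t>(blend_channel(s, d, alpha)) << shift;
    }
    return out;
}
```

Doing this per pixel in C++ is the slow part; the opaque fast path is where a real implementation would fall through to the plain win32 call.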
From: Cole Brecheen
To: dev@gobe.com
Sent: Wednesday, November 22, 2000 11:28 AM
Subject: cheap feature that could make us rich
So here's an idea. As you probably know, modern Windows apps that create
documents always have a Send-To command in the File menu that lets you e-mail
the document directly from the app. This has been a Windows logo requirement for
many years, and it is functionality that Windows users expect. A minimum
implementation is simple to add (see
http://msdn.microsoft.com/library/psdk/mapi/_mapi1book_adding_a_send_command_to_an_application.htm),
and is all that most non-Microsoft apps provide. Office 2000, however,
implements it very nicely with excellent translation of the document into HTML
e-mail, and has raised the de facto standard for this feature.
Our HTML
translator will let us easily do as good a job on Send-To as Office 2000 does.
We could go one step further, however, and add a feature that would make Gobe
Productive a must-have tool. Our implementation of Send-To could give users the
option of determining whether and exactly when each message was read by the
recipient. We could use the same technique that spammers are already using, but
that no one has yet thought to put in the hands of ordinary users: hit counts on
zero-sized image files uniquely named to identify each message and recipient.
The image files could live on the Gobe server, and we could provide a web
interface that would let people go to the Gobe web site and check whether and
when sent messages have been read. We could limit volume (10 receipts a day?
20?) to avoid being accused of aiding spammers.
This is a feature that
could get many many millions of users very quickly. That's not an exaggeration.
Personally, I would switch immediately to sending most of my e-mail from an app
that let me do this. I know that in certain very small corners of the computing
universe there is still resistance to HTML e-mail, but the plain fact of the
matter is that Scot Hacker lost this argument five years ago.
Numerically speaking, AOL was until recently the last significant bastion of
non-HTML-e-mail, and with the release of Netscape/AOL
6.0, it has been assimilated.
I also know some of you have concerns
about the privacy implications of e-mail that allows senders to know whether and
when messages have been read. All I can say is that it has already happened --
big companies (and big spammers) already have this technology and are using it
every day. It's just that no one has yet thought to give it to ordinary
individuals who need it for perfectly legitimate purposes.
Cole
Over the Thanksgiving break I worked on cursor management, and a bizarre Win98 DLL-loading problem: 45 seconds to load each DLL over a 10Mbps LAN connection -- problem goes away for a while after rebooting, but comes back consistently -- may be unique to the Win98 debug machine I use at home. I was beginning to reach the conclusion last night that the DLL-loading problem probably had something to do with calling AlphaBlend under Win98, but what the problem is, exactly, I don't know. Here in the office today I plan to focus on cursor management, but I'll check the DLL loading problem on a Win98 machine here.
I'm still working on cursor management this morning, but I think I'm getting
close to the end of it. I discovered a couple of things yesterday that were
contributing to our problems. The first was that, when you have overlapping
HWNDs, win32 sends a WM_SETCURSOR
message to both windows. Because BeOS
quite sensibly does not send mouse-move messages to obscured windows, I had to
add code to our WM_SETCURSOR handler that tries to make sure the HWND is topmost
before passing anything on to BeOS
code. The second was that WM_SETCURSOR was passing the mouse location in screen
coordinates instead of client coordinates.
With those problems fixed,
I'm still seeing several things that are wrong: (1) there is a zone about
1/4-inch wide at the left edge of a graphics document where the cursor does not
change when it should; (2) disabled EDIT controls on one of the tool ribbons are
setting the cursor, and setting it to the wrong shape, which then doesn't get
set back to the right shape when you move the cursor back into the client area;
(3) if you do it just right, you can move the cursor from outside to inside a
document window, which should cause it to become a resize cursor only for an
instant when it passes over the window edge, but have it remain a resize cursor
instead of an arrow. And then of course we still have a problem with failing to
detect the interiors of filled objects, which causes us to get a drag cursor
only on object edges.
I'm still struggling with cursor management problems today. Last night I discovered that, if you move the cursor to some position where it changes to the I-beam and then exit the app, the next time you launch, CreateCursor will ignore the image data passed to it and return an I-beam cursor instead. It's very, very strange, but demonstrably true. In fact, I discovered in the debugger that I could force CreateCursor to pay attention to the passed-in image data by modifying just one byte of the image data, creating a cursor from it and then destroying that cursor, then restoring the image data to its original state and creating the cursor again. At the moment I have code checked in which does exactly that, but it's obviously covering up some other problem. I thought last night that the problem surely must be a resource leak, and I did find and fix a failure to destroy the cursors that we create, but that didn't solve the problem. So today I'm going to clean up any cursor-related BoundsChecker warnings just in case there are any other cursor leaks, and then dig back in.
Aaarrrggghhhh! I've fixed all the win32 cursor leaks, and the problem is still there -- the I-Beam and crosshair cursors are randomly interchanged after the first appearance of the I-Beam. I thought for a while after fixing all the leaks that maybe Windows is remembering the last cursor set by any member of the window class, and restoring to that cursor anytime we do a FORWARD_WM_SETCURSOR. But we get the same problem even with all FORWARD_WM_SETCURSOR's removed (and with them removed we have additional problems with failure to get correct cursors on win32 controls). I need some way of determining when a cursor has gone bad (i.e., has gotten interchanged with a different cursor).
I thought last night that maybe this cursor problem was behind me, because I made a few changes on a build I did at home, and then couldn't reproduce the problem anymore. I brought that build into the office this morning, however, and it has exactly the same problem I was seeing here yesterday. At this point, I'm looking for any hack, no matter how gross, that will make the problem go away. At the moment I'm trying the friendliest possible handling of WM_SETCURSOR messages -- I'm just forwarding them -- and setting the cursor only in WM_MOUSEMOVE handlers. It doesn't help. Next I'm going to try to figure out why it's just this one custom cursor -- the crosshair -- that gets interchanged with the system I-Beam cursor. If I understood that, maybe I could at least shift the problem to some lower-traffic cursor.
I finally managed to put all these cursor management problems behind me by using CopyImage to duplicate the arrow and I-beam system cursors before using them. It seems pretty clean and reliable. I still don't understand how we could have been changing the system cursors without calling SetSystemCursor, but we clearly were. I guess it will just have to go on the big unsolved mysteries pile.
I'm working today on hooking up libart and freetype so that we can switch between them and their native win32 equivalents using a runtime switch. My first goal is just to get everything building and running at our present level of functionality with all the libart and freetype stuff compiled and linked in, but turned off via the runtime switch.
A quick reminder to myself: TextOutUsingAlpha() is not setting the HDC origin correctly for non-opaque drawing. If we're going to continue using it, that needs to be fixed. This was why the Preferences-Dialog listbox quit working a few weeks ago.
We are now running reasonably well with all the libart and freetype code
compiled and linked in, but switched off at runtime. There do seem to be some
side-effects even with the runtime switch turned OFF. As you enter text in a WP
document, for example, it doesn't appear on screen until you press Enter. There
also seems to be something wrong with drawing handles for selected objects in
graphics documents -- the handles don't draw immediately, but then later
do draw. I'm not sure what the trigger is.
But it's good enough
to provide basic functionality while I work on getting us stable with the
runtime switch turned ON. That's what I plan to work on today.
I've been struggling for days with a Win98-specific GDI-memory leak, and it is killing me. It makes me want to give up and go work behind the counter at McDonald's. Under Win98, we're exhausting the 16-bit GDI heap very rapidly (sometimes in less than a minute), after which the whole OS is prevented from drawing. HeapWalk shows the GDI heap filling up with hundreds of 48-byte fixed objects of unknown type. I have a feeling that they are DIB-sections. I'm about to test that theory now.
It's not a DIBSection, although I still don't know exactly what it is. I do know, however, that I can make the problem go away in one particular stretch of app code by turning off calls to BView::PushState and BView::PopState.
I finally determined exactly what was causing all those mysterious 48-byte
fixed objects of unknown type that were filling up and fragmenting the 16-bit
GDI heap. They appear anytime you delete a font that is still selected into an
HDC. The delete appears to succeed -- it returns TRUE (meaning success), and
leak-finding tools such as BoundsChecker
don't detect anything wrong. But you do run out of 16-bit GDI heap space if you
let more than a few hundred of these fixed-location objects accumulate. Win98
then quickly loses its ability to draw, and operations such as CreateCompatibleBitmap()
start to return NULL with no sensible explanation via GetLastError().
Guaranteeing that our BeEmulation
subsystem never deletes selected fonts has turned out to be harder than I
expected. I spent ten or fifteen hours on it over the weekend, and got close,
but didn't succeed in getting a completely clean run through all the asserts I
added to detect this situation after determining what it was. I did finally
succeed about an hour ago, however, so I'm now packaging up a build that I can
turn over to the testers.
We wound up not delivering anything to the testers yesterday because of this god-awful fill control that's now broken again. Scott Lindsey changed it in some way that makes it easier to support under Linux, but I don't see any way to tweak the BeOS code to eliminate its dependence on a separate menu-tracking thread. So I'm going to quit working towards a clean non-threaded emulation. We just need something that works, and that won't break again.
It's hard to be productive when you're working in the app (as opposed to the
BeEmulation
DLL) because the app takes so long to link. Any change in the app involves at
least four or five minutes of build time on my machine. It does allow a little
extra time to think, however.
Anyway. Now that I know the app's
fill-control code isn't even used under Linux, I'm beginning to see how we can
make it work under Windows again.
I'm having trouble with an assertion failure that claims we are not detaching a BMenu from a BWindow before destroying it. Scott Lindsey says it points to a deeper problem in the way BMenus fit into the BWindow/BView hierarchy under Windows. I think he's right; I've tried before just to ignore the underlying problem here, which is that in the win32 version of this emulation BMenus claim to have BWindow owners even when they aren't visible. Ignoring it always seems to lead to some other complication, however, so I'm going to try and straighten out this whole mess over the weekend.
Today is officially a holiday, but I'm working at home on this BMenu restructuring problem. It's going reasonably well. I've eliminated a lot of complexity by separating a bunch of BMenu and BMenuBar code and deleting a whole class (BMenuBarWindow) that I don't think will be necessary anymore. Since BMenuBars are always visible when the document window itself is visible, it just causes needless confusion to treat the BMenuBar as a BMenu.
I'm still working on BMenu restructuring. It has turned out to be harder than I thought, but I do have menus working again. We still have crash-on-exit problems, however, resulting from confusion over what code is responsible for deleting the BWindows that we create as owners of B_MATRIX_LAYOUT menus. I expect to get those problems resolved within the next few hours, and should be able to check all this stuff in before I leave this evening.
Rats. Carl noticed this morning that my menu changes broke keyboard accelerators, and introduced menu-like side-effects (underlined characters) in some controls that aren't really menus under win32 (i.e., combo boxes). I'm fixing those now. And I still have a crash-on-exit problem that needs to be fixed before we can give a stable Win98 build to the testers. I'd like to do that by Friday at the latest.
I think I've finally solved (or at least found a work-around for) an
extremely frustrating build error that started plaguing us a couple of weeks
ago. The problem has been that the linker sometimes fails because of an error in
CVTRES, which complains about duplicate resources and a corrupt Phoenix.res
file. There were never actually any duplicate resources that I could find
anywhere, however. The problem was somewhat sporadic -- we'd see it happen, and
sometimes just relinking would make it go away. Other times it would appear to
go away after we deleted the Phoenix.res file, or after we deleted some of the
files that the linker apparently uses to do incremental links of Phoenix.exe.
But sometimes even that wouldn't work, and we'd resort to getting new resources,
which pretty much requires getting new sources to match, and which generally
involves rebuilding everything in sight, at a cost of at least an hour or two in
build time. This past Friday, even that wasn't making the problem go
away, and it caused us not to be able to produce a stable Win98 build on the
last work day of the year. Not good.
So I packed everything up on Friday
and decided to make it a top priority over the weekend to nail this problem
down. I built at home Friday night and got immediately into the problem state,
and then set about whittling the problem down to determine what was causing it.
What I finally concluded was that, at a certain size threshold (our .res file
is now about 580K) CVTRES can no longer reliably convert it to COFF format
in-place when MSLink invokes it. It's not a problem with any of the inputs to
RC.EXE or with the Phoenix.res file itself. You can make the problem go away
just by running CVTRES from the command line to convert Phoenix.res to COFF
format before the link so that MSLink won't try to do it. If you do this
specifying a different output file name, it always seems to succeed. I expect I
will modify our makefile on Tuesday to do this automatically every time we link.
If you're looking for an easy way to make a living, don't pick software
development. These last couple of days back in the office after the long weekend
have been extremely frustrating. The workaround I thought I'd found on Sunday
for the corrupt-Phoenix.res-file problem didn't hold; I started getting the
error again first thing yesterday morning, and converting the .res to COFF by
hand didn't help. So, since the CVTRES error message continued to complain about
duplicate resource IDs, I modified the .rc-generation procedure to sort the
resource ID list by type just to help track down any possible duplicates. And,
oddly enough, that made the problem go away -- at least for now.
With
that problem finally at bay, the debug build was looking pretty decent under
Win98 early yesterday afternoon, so I started trying to package up a release
build for delivery to testing. It, too, looked good -- better than it ever has
in most respects -- except for one crippling problem: a crash on exit that
required a machine restart. Not to worry, I thought; since our build process
always links non-debug executables a second time with a symbol table, I
can just debug the problem using those debuggable release-mode executables.
Unfortunately, they turned out not to be debuggable after renaming them,
for example, from "Debuggable_Phoenix.exe" to "Phoenix.exe" -- apparently because
of some reference in the .pdb file to the original name of the .exe or .dll as
produced by the linker -- so I had to make some very tedious and not-very-pretty changes
in the link process that allow the debuggable release-mode executables to be
linked using the same names as the non-debuggable versions. That finally works
now, after four or five hours of hair pulling, so maybe I'll be able to
figure out what is causing the release-build-only Win98 crash on exit problem.
I have tear-off menus pretty much working again now. This is at least the second and maybe even the third or fourth time I've gotten them to work acceptably and then had to go back and fix some major break. That's probably an indication that the implementation is a little too fragile. This time, it turned out that the problem was mostly related to mouse capturing. Scott Lindsey rewrote the app code to use more conventional mouse handling techniques; unlike Tom's previous implementation, popped-up menus now capture mouse events. Unfortunately, the win32 version of the emulation had some holes in its support for BeOS-style mouse capturing, and my drag-rectangle code was already using the win32-level SetCapture and ReleaseCapture. So I had to plug the holes by adding SetCapture support to our emulation of SetEventMask and by making our WndProcs aware that certain BViews might be eligible to get mouse events on the basis of their event masks. I also added code to the DragRect wndproc that forwards mouse events to mouse-capturing BViews even though the transparent DragRect HWND already has the mouse captured at the win32 level.
This morning, because of another crash-on-exit problem in the BMenu destructor that we discovered late Friday evening, I'm trying to completely sever all parent-child relationships between BWindows and BMenus. It looks like the right thing to do, and I've already got most everything to work with all the AddChild / RemoveChild stuff deleted from BMenu.cpp. But I'm left with a crasher in the fill control that I haven't yet tracked down. My best guess is that I'll have it all straightened out by noon.
I wound up not being able to break the parent/child relationship between
BWindows and BMenus that seemed to be causing us problems on Friday. I fixed the
crash-on-exit problem by removing some BWindow-creation code in BMenu::AttachedToWindow
that was duplicated in BMenu::Show. We may have to revisit this BMenu
implementation, however. I just hope it can wait until after this version ships.
Since I'm waiting on some code from Bruce before I can make more
progress with using LibArt
to draw document window contents, I think the next task for me is likely to be
drag-and-drop. The first step will be to change our Win32-clipboard handling
code so that it uses the OLE
clipboard instead.
I spent several hours last night struggling again with the MSVC profiler, but finally got good data out of it. It showed exactly where the worst of our speed problems are in dragging around large objects. It turned out that they resulted largely from a huge number of WM_MOUSEMOVE events generated because attaching a BView to a BBitmap for offscreen drawing purposes creates an HWND, which generates WM_MOUSEMOVEs, which causes us to create more offscreen bitmaps (it's part of a technique we use for determining whether the cursor is hovering over the interior of a filled object). The net effect was that we were getting and processing WM_MOUSEMOVEs constantly, even when the cursor was completely still.
I made a lot more progress yesterday on the speed problems we were having with dragging large objects around in a graphics document. There were two places in mouse-tracking loops where we were allowing WM_MOUSEMOVE and WM_TIMER messages to be dispatched and handled. I fixed those by adding a new PumpAllButSpecifiedWindowMessages() function. Drag speed now seems fine. Bruce thinks there may still be a speed problem in his gradient rendering code, but otherwise this particular performance problem seems to be behind us.
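The idea behind a pump that skips certain messages can be modeled without any win32 at all. The sketch below is only a toy model and not the real PumpAllButSpecifiedWindowMessages(): the queue is a plain std::deque of message IDs, pump_all_but is a hypothetical name, and I'm assuming skipped messages stay queued in their original order rather than being discarded. The three message IDs are the values from winuser.h.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

// Message IDs as defined in winuser.h.
constexpr unsigned kWmPaint     = 0x000F;  // WM_PAINT
constexpr unsigned kWmTimer     = 0x0113;  // WM_TIMER
constexpr unsigned kWmMouseMove = 0x0200;  // WM_MOUSEMOVE

// Toy model: dispatch every queued message except the skipped IDs.
// Skipped messages remain in the queue, in order, for the outer loop.
// Returns the IDs that were dispatched.
inline std::vector<unsigned> pump_all_but(std::deque<unsigned>& queue,
                                          const std::vector<unsigned>& skip) {
    std::vector<unsigned> dispatched;
    std::deque<unsigned> held;
    for (unsigned msg : queue) {
        const bool skipped =
            std::find(skip.begin(), skip.end(), msg) != skip.end();
        if (skipped) {
            held.push_back(msg);
        } else {
            dispatched.push_back(msg);
        }
    }
    queue.swap(held);
    return dispatched;
}
```

In a tracking loop, the skip list would hold WM_MOUSEMOVE and WM_TIMER so that paints still get through but the loop keeps control of the mouse.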
So, drag-and-drop (i.e., OLE clipboard) support is still the next big task for me, but I think that first I need to fix some menu tear-off problems that have crept in over the last month or two. I also need to get printing to work with freetype/libart turned on so we can see what that's going to be like.
Yesterday I finally fixed what I hope are the last of our problems with tear-off menus. I had to add code that dealt with ending the drag-rectangle-drawing operation as a separate case, just as it has been dealing with starting the operation as a separate case. And I had to change the sequence of events involved in destroying the transparent full-screen window that we use as a drawing surface for the drag rectangle, because certain information associated with it has to survive long enough to let us erase the last rectangle drawn. Today I have to re-enter crash-on-exit hell -- another one seems to have crept in over the weekend.
I wound up spending most of yesterday afternoon and another couple of hours this morning just trying to get our floater activation states right. It was frustrating. We need to hide all floaters (1) when all document windows are minimized, (2) when the app is deactivated by activation of another app, and (3) when a modal dialog appears, and then we need to show them all on the converse of these events. Unfortunately, anytime we show or hide a window, win32 generates activation and deactivation events for other windows, and some of these events are also the ones we're handling to hide and unhide floaters. It gets complicated.
I think it's working okay now, however, with one exception: Holdaway says there are cases where floaters bring up modal dialogs, and in those situations it wouldn't be good to hide the dialog's parent window.
At the moment I'm trying to fix the fill control again. It turned out that a cross-platform bug I'd been seeing with a tab drawn in the wrong place on a second invocation of the control was a side effect of code I had added that abused the meaning of a certain member variable. Tom deleted the code, and now the control is broken again. He suggested adding another member variable instead of overloading the meaning of an existing one. That's what I'm experimenting with now, but I don't relish having to get back into this code.
Over the weekend I fixed most of our vertical scrolling problems -- horizontal scrolling is still not working very well at all. I also fixed several build problems that seem to have appeared out of nowhere. The profiler, for example, quit working because it claimed our executable was linked with a /FIXED switch. That has never been true, but it turns out that /FIXED has always been the default. So how was I able to profile before this weekend? I added a /FIXED:no switch to the makefile and now the profiler works again.
I'm working today on getting printing to work via libart and freetype. It's a mess, but I think it's going to work okay. I can't tell yet how big the print-spooler files are going to be, but I'm structuring the code to support a "Faster printing" user-settable preference that will let us print through GDI if we really need to.
While I'm thinking about it, here is a good article about how AppleWorks is currently being used in the K-12 market. It makes clear just how vulnerable Apple is going to be to any competitor capable of offering the same or better functionality in a well-supported cross-platform package.
I solved an irritating little printing problem today that turned out to be just an algorithmic error on my part, and so we are now able to print good-looking text that isn't really text, so far as the printer driver is concerned; instead, it's lots and lots of lines and bezier curves. I'm turning my attention now to printing graphics documents.
I finally managed last night to get a complex document to print reasonably well. I also produced a competitive analysis grid showing how we can position the app to win. Looks like I should have included a new entrant in the Windows market: EasyOffice 2001.
I just finished having to work around a bizarre set of problems that showed up for the first time on Peggy's Win98 machine: a Compaq Presario 1200 laptop. The root of it all seemed to be that all of the mutex- and semaphore-related APIs hang inexplicably on her machine. Since at the moment we're not multithreaded, I worked around it by adding tests that disable all those calls when we're running on a non-NT-based platform, but that's not a good long-term solution. We're soon going to really need at least one separate thread for handling the get-from-web process, so this is something I'm going to have to figure out. For now, however, these changes get the app running smoothly on her machine, and probably on some percentage of other Win98 machines as well.
The printing problem I've been working on all day is just too hard, so I've put it aside for a little while to try and fix a problem with incorrect positioning of the WP cursor. My current theory there is that we've got rounding errors building up in FTFace::GetAdvance.
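To make the rounding theory concrete, here is a small sketch of how the error could build up. It assumes advances are held in FreeType-style 26.6 fixed point (64 units per pixel) and compares rounding each glyph's advance to whole pixels before summing against summing first and rounding once. I am not claiming this is what FTFace::GetAdvance does; the helper names are made up for the illustration.

```cpp
#include <cstdint>
#include <vector>

// 26.6 fixed point: 64 units per pixel, as used for FreeType metrics.
using F26Dot6 = std::int32_t;

// Round a non-negative 26.6 value to the nearest whole pixel.
inline int RoundToPixel(F26Dot6 v) { return static_cast<int>((v + 32) >> 6); }

// Pen position if every glyph advance is rounded before it is added.
// Each glyph can contribute up to half a pixel of error, and the errors add.
inline int PenXRoundedPerGlyph(const std::vector<F26Dot6>& advances) {
    int x = 0;
    for (F26Dot6 a : advances) x += RoundToPixel(a);
    return x;
}

// Pen position if advances are accumulated in 26.6 and rounded once.
// The total error stays within half a pixel however long the run is.
inline int PenXRoundedOnce(const std::vector<F26Dot6>& advances) {
    F26Dot6 sum = 0;
    for (F26Dot6 a : advances) sum += a;
    return RoundToPixel(sum);
}
```

For ten glyphs that each advance 6.25 pixels (400 in 26.6), the per-glyph version puts the pen at 60 and the accumulate-then-round version puts it at 63, which is the kind of drift that would leave a cursor visibly to the left of where the text actually ends.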
The cursor positioning problem turned out to be even harder than the printing problem.
The learning curve on OpenGL is awfully steep. I've probably spent a good solid day now just trying to get it initialized correctly so that we can create a GL drawing context. My current theory is that it has some unstated assumptions about the HWND referenced by the DC for which you have to set a pixel format. Does the HWND have to be WS_VISIBLE at the time you set a pixel format for its DC? I'll have to determine the answer empirically, because the OpenGL docs don't say.
With a little help from an MSDN article on drawing to DIBs with OpenGL, I finally got OpenGL to initialize correctly under Windows just a little while ago. We still don't get 3D charts, but I'm pretty sure the reason is that the bitmap into which we're trying to draw starts out at zero size. I need to add code to re-initialize OpenGL when the size of the bitmap changes. I'm pretty sure I'm on the right track because Bruce made a similar change in the Linux version a couple of weeks ago, which I didn't understand at the time. Now I do.
Resizing OpenGL charts at the right time did work -- they are beginning to draw now -- but they are drawing upside down. They also seem to be distorted in some way that may be a row-stride issue.
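Both symptoms are typical of row-order and row-padding mismatches between two bitmap layouts. As a sketch of what such a fix generally looks like (not the change actually made in the app), the code below copies a tightly packed image into a buffer whose rows are padded to a multiple of 4 bytes, which is the rule for Windows DIB scan lines, and reverses the row order on the way. DibStride and FlipIntoDib are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Bytes per scan line in a Windows DIB: each row is padded to 4 bytes.
inline std::size_t DibStride(int widthPixels, int bitsPerPixel) {
    return ((static_cast<std::size_t>(widthPixels) * bitsPerPixel + 31) / 32) * 4;
}

// Copy a tightly packed image (no row padding) into a DIB-style buffer,
// reversing the row order. Padding bytes in the result are zero.
inline std::vector<std::uint8_t> FlipIntoDib(const std::vector<std::uint8_t>& src,
                                             int width, int height,
                                             int bitsPerPixel) {
    const std::size_t rows = static_cast<std::size_t>(height);
    const std::size_t rowBytes =
        (static_cast<std::size_t>(width) * bitsPerPixel + 7) / 8;
    const std::size_t stride = DibStride(width, bitsPerPixel);
    assert(src.size() >= rowBytes * rows);

    std::vector<std::uint8_t> dst(stride * rows, 0);
    for (std::size_t y = 0; y < rows; ++y) {
        const std::uint8_t* from = src.data() + rowBytes * y;
        std::uint8_t* to = dst.data() + stride * (rows - 1 - y);
        std::memcpy(to, from, rowBytes);  // copy pixels only, leave padding zeroed
    }
    return dst;
}
```

A 3-pixel-wide, 24-bit image has 9 bytes of pixels per row but a 12-byte stride. Reading it with the wrong one of those two numbers shears each successive row a little further, which shows up as a diagonal distortion.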
OpenGL charts are now working at a sort of minimum level. They are being created at the wrong resolution, and the axis labels don't line up very well with the bars, but otherwise they look and act pretty much like they do under BeOS. Bruce says he thinks the resolution problem is also happening on Linux, so I'm going to put it aside for a few days so he can look at it there.
Today I'm going to try to get tooltips working again. The fundamental problem appears to be a window activation issue: creating the tooltip window through the regular BeEmulation BWindow mechanism deactivates the document window, with all kinds of side effects including destruction of the tooltip window we just created. So I'm thinking of trying to use the regular win32 tooltip control instead.
Tooltips are working again, with just a couple of lingering problems: the yellow tip-text rectangle is a little too large for the text, and we get a bit of flashing elsewhere in the UI when a tooltip appears. The first problem is probably the result of a dots-to-points or points-to-dots conversion that we're not doing somewhere. I don't know about the second.
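For reference, the conversion in question is just a scale by the device resolution, with 72 points to the inch. The helpers below are a minimal sketch with made-up names, and the 96 dpi figure in the note that follows is an assumption about a typical display, not something measured in the app.

```cpp
// One point is 1/72 inch; dpi is the device resolution in dots per inch.
inline double PointsToPixels(double points, double dpi) {
    return points * dpi / 72.0;
}

inline double PixelsToPoints(double pixels, double dpi) {
    return pixels * 72.0 / dpi;
}
```

At 96 dpi, 12 points is 16 pixels, so a rectangle that goes through the points-to-dots step once too often comes out a third too large in each dimension, which would be consistent with a tip rectangle that is a little too big for its text.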
Getting these things to work turned out to be a very typical Windows development experience. I spent most of Friday and a big chunk of the weekend trying to implement our tooltip support through the win32 common tooltip control, but never got the control to work at all, and still don't know exactly what I was doing wrong. I realized on Sunday, however, that using the common tooltip control as it was really intended to be used would require a lot of app-level changes -- the common tooltip control doesn't provide a clean API for simply displaying a tooltip at a certain location for a certain length of time -- so I decided yesterday to go back to the BeEmulation method, which had been fairly close to working before. I solved the activation-event problem pretty quickly, and before noon had tooltips displaying correctly, but with the wrong background color. No problem, I thought: the emulation draws tooltips with static EDIT controls, and the mechanism for changing colors in EDIT controls is well documented. Right. I spent the rest of the day tinkering with code that should have worked, but didn't; I tested dozens of variations and theories about why it wasn't working, and finally discovered that all of Microsoft's instructions silently assume that you are smart enough first to make sure that the HDC's background mode is TRANSPARENT instead of OPAQUE. That's something I'll remember for six months or a year, but eventually I'm sure I'll have to learn it the hard way again.
I now have the Settings button in the File-Open dialog working very nicely. It did take more than half of yesterday to figure out exactly how to subclass the common file dialog these days. Since the first time I had to do this, back in Win3.1, we've gone through at least two and maybe three generations of the standard technique for subclassing common dialogs, and there seem to be no examples of the currently approved method anywhere in MSDN or on the web. The SDK documentation is complete and accurate, however, so it's not too difficult just to follow the instructions. Now that I think of it, I probably could have found an example somewhere in the Platform SDK sample code, but chances are that it would have been unnecessarily complicated since I only needed to add one button.