QoS - is WRED really a congestion avoidance mechanism?
WRED is a QoS mechanism used as a proactive measure to avoid queue exhaustion, not congestion itself. When a packet arrives at an interface for transmission and the interface is not yet at capacity, the packet is served immediately and never enters the queue. A packet enters the queue only if there is already congestion on the link.
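To make the point about queuing concrete, here is a minimal sketch of a single FIFO link, assuming time is divided into ticks and the link can send a fixed number of packets per tick. The function name `queue_depth` and the traffic numbers are made up for illustration; real routers are more involved than this.

```python
def queue_depth(arrivals, service_per_tick):
    """Return the backlog (in packets) after each tick for a simple FIFO link."""
    backlog = 0
    depths = []
    for arrived in arrivals:
        # Packets are sent up to the link capacity; only the excess waits in the queue.
        backlog = max(0, backlog + arrived - service_per_tick)
        depths.append(backlog)
    return depths

# Offered load at or below capacity: nothing is ever queued.
print(queue_depth([3, 4, 5, 4], service_per_tick=5))  # [0, 0, 0, 0]
# Offered load above capacity: the queue starts to fill.
print(queue_depth([6, 7, 5, 3], service_per_tick=5))  # [1, 3, 3, 1]
```

In the first run the queue stays empty, so there is nothing for WRED to act on. It only has work to do in the second case, where the link is already congested.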
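The documentation does use the term "congestion avoidance", but what we are actually performing is "tail drop avoidance". With tail drop, every packet that arrives at a full queue is discarded, so many TCP connections lose packets at the same moment and back off together. By dropping a few packets early, WRED aims to keep those connections from all going into slow start at once, which would slow traffic down in general.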
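The difference between tail drop and early drop can be sketched roughly as follows. This is an illustration of the general RED idea, not any vendor's exact implementation: the thresholds, the maximum drop probability `max_p`, and the averaging weight are assumed values chosen for the example.

```python
def update_average(avg, current_depth, exp_weight=9):
    """Exponentially weighted moving average of the queue depth.

    exp_weight=9 is an assumed value; a larger weight smooths out short bursts.
    """
    w = 2 ** -exp_weight
    return avg * (1 - w) + current_depth * w

def wred_drop_probability(avg_depth, min_th, max_th, max_p):
    """Drop probability for one traffic class, given its average queue depth."""
    if avg_depth < min_th:
        return 0.0   # queue is shallow: never drop
    if avg_depth >= max_th:
        return 1.0   # past the maximum threshold: drop everything
    # Between the thresholds the probability rises linearly up to max_p.
    return max_p * (avg_depth - min_th) / (max_th - min_th)

def tail_drop_probability(depth, queue_limit):
    """Tail drop: nothing is dropped until the queue is full, then everything is."""
    return 1.0 if depth >= queue_limit else 0.0

# Hypothetical profile: min threshold 20, max threshold 40, max_p 0.1.
print(wred_drop_probability(10, 20, 40, 0.1))  # 0.0
print(wred_drop_probability(30, 20, 40, 0.1))  # 0.05
print(wred_drop_probability(40, 20, 40, 0.1))  # 1.0
print(tail_drop_probability(30, 40))           # 0.0
```

With an average depth of 30, WRED is already dropping a small fraction of packets while tail drop is still dropping none. That is the "early" part: a few flows are told to slow down before the queue fills and every flow is hit at once. The "weighted" part comes from giving each class (for example, each IP precedence or DSCP value) its own thresholds.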
Congestion can also be induced "artificially", so to speak. If you use a class map to match traffic of a certain type on an interface, such as the bulk traffic you mention, and limit it in a policy, you are in essence creating a situation where congestion can take place. It is not congestion caused by reaching the physical limit of the interface, but by reaching the limit you configured yourself. When that happens, you can configure WRED to operate on that class as well. But again, congestion must occur before a queue begins to fill up.
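A rough sketch of that "artificial" congestion, assuming a class is drained at whichever is lower, the physical line rate or its configured limit. The function `class_backlog` and the rates are hypothetical and only meant to show the effect.

```python
def class_backlog(arrivals, line_rate, class_limit):
    """Backlog per tick for one traffic class with a configured rate limit."""
    # The class can be drained no faster than the lower of the two rates.
    drain = min(line_rate, class_limit)
    backlog = 0
    depths = []
    for arrived in arrivals:
        backlog = max(0, backlog + arrived - drain)
        depths.append(backlog)
    return depths

bulk = [4, 4, 4, 4]  # bulk traffic offered per tick, well below the line rate of 10

# No configured limit below the line rate: the class never queues.
print(class_backlog(bulk, line_rate=10, class_limit=10))  # [0, 0, 0, 0]
# Class limited to 3 per tick: its queue fills although the interface has spare capacity.
print(class_backlog(bulk, line_rate=10, class_limit=3))   # [1, 2, 3, 4]
```

The second run is the case where WRED on that class becomes relevant: the class queue is growing even though the interface as a whole is far from full.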